APPROVED FOR PUBLIC RELEASE
DISTRIBUTION  UNLIMITED 


FINAL  REPORT 


RESEARCH  ON  SECURE  SYSTEMS  AND  AUTOMATIC  PROGRAMMING 
Period:  March  1,  1973  to  August  31,  1977 


Co-Principal  Investigators: 

Saul  Amarel 
(201)  932-3546/2001 

C. V. Srinivasan
(201)  932-2019/2001 


ARPA  Order  Number  2406 
Program  Code  Number  3030 


Grant  Number  DAHC15-73-G-6 


Department  of  Computer  Science 
Rutgers,  The  State  University 
New  Brunswick 

New Jersey 08903

October 1977


SOSAP-TR-37 



January  1977 


RELATIONS  BETWEEN  RECURSIVE  DEFINITIONS  AND  THEIR  EFFICIENT  ALGORITHMS 
M. C. Paull




JUN  2  4  1983 




Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6  to  the 
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


1.  INTRODUCTION 


Typically there are significant differences between the formulation
of an algorithm and its ultimate implementation. For example, the minimum
path between two nodes in a weighted di-graph can be found by enumerating
all paths between the two nodes and choosing the smallest. This approach
can easily be formulated as a recursively defined function, which may in
turn be implemented in a standard way. The result is significantly different
from Dijkstra's algorithm, the favored shortest path implementation. On
the one side, close to the problem statement, there is an initial,
simply formulated, but often inefficient algorithm. On the other side,
nearer to the final implementation, is an efficient algorithm. The study
of the connection between these two is the subject of this paper.

It will be assumed that the initial formulation of an algorithm is
as a recursive definition and that this definition is in a standard form
(to be given). The standard form was chosen because, firstly, it is one
which, in our experience, has frequently arisen naturally as an initial
algorithm formulation. Secondly, the chosen form lends itself nicely to an
overview of a variety of possible implementations of the algorithm thus
formulated. The recursive definition, though sufficient to provide the value
of the function anywhere in its domain, is non-deterministic as to which of
a variety of sequential implementations is to be used to determine that
value. The variety of implementations corresponds to the various orders of
substitution which are equally valid in evaluating such a definition.

Some orders of evaluation become possible only if the primitive functions
which enter into the recursive definition have appropriate properties.
Different orders of evaluation will result in different memory requirements,
but will not cause significant time differences in the resultant
implementations. This dependence of memory requirements on the order of
evaluation is the main subject of section 2 of this paper.

Implementation of the recursive definition generally requires the
repetitive execution of similar operations. If it can be shown that some
pairs of these operations will yield the same or similar intermediate results
at different points in the computation, then only one such intermediate result
need be computed and remembered. It may then be accessed from memory when
needed again instead of being recomputed. This can happen many times in a
sufficiently systematic way so that a significant time saving can be realized.
The existence of this situation depends on properties of the primitive
functions which compose the recursive definition. In section 3 a significant
class of problems for which time efficient implementations are available is
considered.


Related Work

The work reported here is in an area of study in which there have
been a number of significant publications. Strong has identified a
class of recursive definitions for which memory efficient implementations
[5,6] (called 'flowcharts') are available. This class is defined in terms
of a recursive scheme whose constituent primitive functions are virtually
unrestricted. If the properties of these primitive functions are restricted
somewhat, a wider class of recursive definition forms will yield similar
memory efficient implementations. Such restrictions are considered here
because they arise naturally in practice. So this aspect of the work can be
considered an extension of Strong's results.

Burstall and Darlington studied properties of recursive definitions
whose existence allows efficient implementation, with one objective being
the incorporation of a search for such properties in an optimizing
compiler [2]. Later Burstall and Darlington extended this study to
consideration of transformations of recursive definitions which are likely
to produce better implementations [3]. The spirit of our work here is
largely in tune with that of these investigators, with some significant
differences in emphasis and in the particular properties studied. Our
emphasis has been mainly on understanding the complete set of properties
which allow the transformation from an initial recursive definition to the
best algorithms actually known, and on the proof of this connection. Thus
we tend to consider relatively complex sets of properties and
transformations as opposed to many simple properties and transformations.
We also study mainly one form of first-order* recursive definitions, rather
than the many forms they consider.

In a more general way this work is also related to work in AI and the design
of algorithms, to which specific reference will be given at the appropriate
point in the paper.

The remainder of this introduction is devoted to a sketch of the
definitions and results to be detailed in the subsequent sections of
the paper.

* First-order means a definition in which the defined function symbol never
appears nested on the right.


Appendix I contains a summary of most of the notation used in the
paper.  (This  notation  is  also  defined  on  first  use  in  the  paper.) 


The  Standard  Form 

This paper concerns the implementation of recursive definitions of a
function $f(X)$ in a class $F$ in which every definition has the following
form:

$$
\text{I}\quad
\begin{cases}
f(X) = q(X) & \text{if } T(X) \quad \text{(terminal condition and values)} \\
f(X) = w\bigl(f(o_1(X)), \ldots, f(o_{m(X)}(X))\bigr) & \text{if } \neg T(X) \quad \text{(body)}
\end{cases}
$$

initially $X \in D_f$ (domain of function $f$)

where the data structure $X \in D_f$, the primitive functions $w$, $q$,
$o_i \in O$, $m$, and the predicate $T$ in the definition, collectively
designated by the tuple $\langle D, w, q, O, m, T \rangle$, must be
constrained so as to make I a terminating definition.

A definition is terminating if, for each $d \in D_f$, the sequence
of expressions resulting from substitution for forms $f(\alpha)$ (where
$\alpha$ is any expression) using I, which starts with $f(d)$, $d \in D_f$,
and next produces $w(f(o_1(d)), \ldots, f(o_{m(d)}(d)))$, etc., has the
properties:

(1) It is always possible to evaluate $T(\alpha)$, and if $T(\alpha)$ is
false it is always possible to evaluate $m(\alpha)$ and $o_i(\alpha)$ for
$1 \le i \le m(\alpha)$.

(2) Independent of the order of substitution for the different
appearances of the form $f(\alpha)$, after the same finite number of
such substitutions a 'terminal' expression will be obtained
in which, for every appearance $f(\alpha)$, $\alpha$ is terminal (i.e.
$T(\alpha)$ is true) and $q(\alpha)$ can be evaluated.

(3) The function $w$ is defined so as to make it possible to evaluate
the terminal expression in any order consistent with its
parentheses structure.

The tuples $\langle D, w, q, O, m, T \rangle$ which satisfy the above
constraints are members of the set $V$. The set of definitions of form I
which satisfy these constraints constitutes the recursive scheme $F(V)$.
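As an illustration only, the straightforward way to evaluate a definition of form I is by direct substitution. The sketch below is not taken from the report: the function name `evaluate`, the calling convention `o(i, X)` for the i-th O-function, and the bit-string instance are all assumptions made for the example.

```python
def evaluate(X, w, q, o, m, T):
    """Evaluate f(X) for a definition of form I by direct substitution.

    Assumed conventions (not from the report): o(i, X) applies the i-th
    O-function, m(X) gives the number of recursive calls, and w accepts
    that many arguments.
    """
    if T(X):                      # terminal condition: f(X) = q(X)
        return q(X)
    # body: f(X) = w(f(o_1(X)), ..., f(o_m(X)(X)))
    args = [evaluate(o(i, X), w, q, o, m, T) for i in range(1, m(X) + 1)]
    return w(*args)


# Hypothetical instance: X = (prefix, n); f(X) is the set of strings that
# extend the prefix by n further bits.
bits = dict(
    T=lambda X: X[1] == 0,
    q=lambda X: {X[0]},
    m=lambda X: 2,
    o=lambda i, X: (X[0] + str(i - 1), X[1] - 1),
    w=lambda *parts: set().union(*parts),
)

print(sorted(evaluate(("", 2), **bits)))
```

The recursion depth here follows the depth of substitution, which is the stack cost that the memory efficient schemes discussed later are meant to avoid.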

This form of definition often arises in practice as an initial solution
to an algorithm design problem, particularly when the problem can be
viewed as requiring an enumeration or an enumeration followed by a
selection (search). The examples of recursive definitions in F(V) given below
arose from adopting such a point of view. Their structure can be
easily seen by evaluating them for some small initial values of their
arguments.


Examples: 

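Ex. 1.1 If $f(X)$ is to be the set of all $n$-bit binary numbers (let $W$
be the set of positive integers), then:

$X \in \{\langle \alpha, n \rangle \mid \alpha \text{ a string of 0's and 1's},\ n \in W\}$

$$
\begin{cases}
f(X) = f(\alpha, n) = \{\alpha\} & \text{if } n = 0 \\
f(X) = f(\alpha, n) = f(\alpha \| \langle 0 \rangle, n-1) \cup f(\alpha \| \langle 1 \rangle, n-1) & \text{if } n > 0
\end{cases}
$$

where $\|$ is string catenation, and $\cup$ set union.

$X$ initially $\in \{\langle \Lambda, n \rangle \mid n \in N\}$

Then e.g. $f(\Lambda, 2) = f(\langle 0 \rangle, 1) \cup f(\langle 1 \rangle, 1) = (f(\langle 0,0 \rangle, 0) \cup f(\langle 0,1 \rangle, 0)) \cup f(\langle 1 \rangle, 1)$
$= \{\langle 00 \rangle\} \cup f(\langle 0,1 \rangle, 0) \cup f(\langle 1 \rangle, 1) =$ etc.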




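Ex. 1.2 If $f(X)$ is the set of all permutations of the first $n$ integers,

$X \in \{\langle n, \alpha \rangle \mid n \in W,\ \alpha \text{ is a string of positive integers}\}$

$f(X) = f(n, \alpha) = \{\alpha\}$ if $n = 0$

and if $p = |\alpha|$ = the length of $\alpha$ then

$f(X) = f(n, \alpha) = f(n-1, \alpha[a_0 \leftarrow n]) \cup \ldots \cup f(n-1, \alpha[a_p \leftarrow n])$ if $n > 0$

where $\alpha[a_i \leftarrow n]$ is an inserting function; i.e.
if $\alpha = \langle a_1, \ldots, a_p \rangle$ then $\alpha[a_i \leftarrow n]$ is the result of inserting the
integer $n$ after component $a_i$ in $\alpha$, or is $\langle a_1, \ldots, a_i, n, a_{i+1}, \ldots, a_p \rangle$.

$X$ is initially $\in \{\langle n, \Lambda \rangle \mid n \in W\}$

Then e.g. $f(2, \Lambda) = f(1, \langle 2 \rangle) = f(0, \langle 12 \rangle) \cup f(0, \langle 21 \rangle) = \{\langle 12 \rangle\} \cup f(0, \langle 21 \rangle)$
$= \{\langle 12 \rangle\} \cup \{\langle 21 \rangle\}$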

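Ex. 1.3 $f(X)$ is the string of moves (each a pair of numbers $\langle a, b \rangle$
meaning move a disc from pin $a$ to pin $b$) necessary to optimally
solve the now classical Tower of Hanoi puzzle. To move $n$
discs initially on pin 1 to pin 2:

$X \in \{\langle \langle x,y,z \rangle, n \rangle \mid \langle x,y,z \rangle \text{ is a permutation of } \langle 1,2,3 \rangle,\ n \in W\}$

$$
\begin{cases}
f(\langle \langle x,y,z \rangle, n \rangle) = \langle x,y \rangle & \text{if } n = 1 \\
f(\langle \langle x,y,z \rangle, n \rangle) = f(\langle \langle x,z,y \rangle, n-1 \rangle) \,\|\, f(\langle \langle x,y,z \rangle, 1 \rangle) \,\|\, f(\langle \langle z,y,x \rangle, n-1 \rangle) & \text{if } n > 1
\end{cases}
$$

$X$ is initially $\in \{\langle \langle 1,2,3 \rangle, n \rangle \mid n \in W\}$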
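The definitions in examples 1.2 and 1.3 can be transcribed almost literally into a programming language, which may help in checking them on small arguments. The following is a minimal sketch; the function names `perms` and `hanoi` and the use of Python tuples and lists for strings are assumptions of the sketch, not notation from the report.

```python
def perms(n, alpha=()):
    """Ex. 1.2: insert n at every position of alpha, then recurse on n-1."""
    if n == 0:
        return {alpha}
    result = set()
    for i in range(len(alpha) + 1):          # i = 0 inserts at the front
        result |= perms(n - 1, alpha[:i] + (n,) + alpha[i:])
    return result


def hanoi(pins, n):
    """Ex. 1.3: moves taking n discs from pin x to pin y, with z as spare."""
    x, y, z = pins
    if n == 1:
        return [(x, y)]
    # catenation of the three move strings in the body of the definition
    return hanoi((x, z, y), n - 1) + hanoi((x, y, z), 1) + hanoi((z, y, x), n - 1)


print(sorted(perms(2)))
print(hanoi((1, 2, 3), 2))
```

Both functions evaluate the definition by plain recursion, so they hold a call stack proportional to n; they are meant only as executable restatements of the examples.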

Algorithms  to  Implement  Definitions  in  F(V)  which  are  Efficient  in  Use  of  Memory 


An 'algorithm scheme' defining a set of algorithms is defined in a manner
analogous to that used in defining a recursive scheme like $F(V)$. In this paper
algorithm schemes generally will involve standard assignment and conditional
statements using the same unspecified set of data-structures $D$, primitive
functions $w$, $q$, $O$, $m$ and predicate $T$ (designated by the tuple
$\langle D, w, q, O, m, T \rangle$) used in defining the recursive scheme
$F(V)$. If we constrain the selection of tuples to be a member of a set $V$,
the set of algorithms thus defined is designated $S(V)$, and a particular
algorithm $\in S(V)$, corresponding to a tuple $v \in V$, is designated
$S(v)$. The recursive and algorithm schemes $F(V)$ and $S(V)$ are equivalent
iff for each $v \in V$, $F(v)$ is equivalent to $S(v)$. A recursive function
definition $F(v)$ and an algorithm $S(v)$ are equivalent if, with domain $D$
in $v$, for every $d \in D$, the value of $f(d)$ as computed with recursive
definition $F(v)$ equals the value of the result of running the algorithm
$S(v)$ with $d \in D$ as its initial value. It is easy to find a
number of algorithm schemes equivalent to $F(V)$.¹

A main purpose of this paper is to show that for a set
$V'$, built from $V$ by constraining the function $w$ to be 'associative'
and the set of functions $O$ to have an 'inverse',² there is an algorithm
scheme $S(V')$ equivalent to $F(V')$ which is particularly efficient in its
use of memory. The algorithm scheme available when these conditions are
satisfied is given in figure 2.2. The algorithm scheme $S(V')$ is given
in terms of the data-structures, primitive functions and transformations
of these primitive functions (inverse of $O$ for example), which are
immediately available under the assumption of the existence of an 'inverse',
that appear in the equivalent recursive scheme $F(V')$.

For many of the recursive definitions in the class $F(V')$, the equivalent
member of the class $S(V')$, which can be obtained mechanically from
the recursive definition, is the 'good' algorithm usually used to realize
that definition. Thus corresponding to example 1.1, the algorithm obtained
by instantiation of that particular $D$, $w$, $q$, $O$, $m$ and $T$ in $S(V')$
is one in which:

¹ Theorems 2.1 and 2.2

² These terms are defined in section 2. An inverse operation plays a similar
role in [6]. Our 'inverse', however, is different, having been independently
developed [7,8] in combination with associativity to delineate another class
of definition with efficient implementations.


First a string of n 0's is formed and outputted, being the first
binary number produced; then, because the rightmost symbol in the string
is a 0, it is changed to a 1 and the result outputted. In general, the
algorithm remembers the last binary number formed and outputted, say X.
The next binary number is formed by a scan of the bits of X starting with
the rightmost bit, and changing them by the following scheme. Let b be
the bit under scrutiny. If b is a 0 it is changed to a 1 and the result
is the next binary number to be outputted. If it is a 1 it is changed to
a 0, b becomes the bit in X one position to the left of the current b,
and the scrutiny is repeated. When the leftmost bit of a number X becomes
b and that bit = 1 then the process terminates. In summary this algorithm
for producing all n-bit binary numbers consists simply in 'adding
1' to produce successive members of the set. It is the 'good' algorithm
for producing the set. It keeps in memory only the last number produced, thus
using an amount of storage roughly equal to that required to hold the
argument of f in its recursive definition. This is characteristic of all the
algorithms in S(V') in relation to the equivalent member of F(V') and is
the 'memory efficiency' mentioned.
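The 'adding 1' procedure described above can be sketched as follows. This is a hand-written illustration of the verbal description, not the instantiated scheme S(V') of figure 2.2; the names `next_binary` and `all_binary` and the list-of-bits representation are assumptions of the sketch.

```python
def next_binary(bits):
    """Return the successor of `bits` (list of 0/1, leftmost bit first).

    Returns None when every bit is 1, i.e. when the scan reaches the
    leftmost bit and finds it set.
    """
    x = list(bits)
    b = len(x) - 1                # start the scan at the rightmost bit
    while b >= 0:
        if x[b] == 0:
            x[b] = 1              # a 0 becomes 1 and the scan stops
            return x
        x[b] = 0                  # a 1 becomes 0 and the scan moves left
        b -= 1
    return None


def all_binary(n):
    """Yield all n-bit numbers, keeping only the last one in memory."""
    x = [0] * n
    while x is not None:
        yield "".join(map(str, x))
        x = next_binary(x)


print(list(all_binary(2)))
```

Only the current number is stored between outputs, which is the storage behaviour the paragraph above attributes to the memory efficient algorithm.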

In a similar way, the algorithm for example 1.2 obtained by instantiation
of the primitives that appear in the recursive definition in example
1.2 produces one permutation at a time. A permutation is produced from the
previous permutation by interchange of adjacent terms. This again is the
'good' algorithm for generating permutations.

Creating  an  Inverse 

In examples 1.1 and 1.2 the given O-functions had an inverse. In
example 1.3 the O-function as given does not have an inverse and thus the
algorithm scheme S(V') is not available. However, as will be shown, when
in a recursive definition $f \in F(V)$ the O-function does not have
an inverse, a simple transformation of $f$ to an equivalent
definition, say $f'$, involving an O-function having an inverse
can always be found in F(V'). Thus $f'$ will have an equivalent in
S(V'). This new definition $f'$ is equivalent to $f$ in the sense that to
each argument $d$ of $f$ there is a 'simply' computed argument $d'$ of $f'$
such that $f'(d') = f(d)$. Using this transformation, an equivalent
definition to that of example 1.3 will be given subsequently, whose equivalent
algorithm in S(V') will produce the moves necessary to solve the Tower
of Hanoi problem one at a time, the only temporary memory necessary
being that for a record of the previous move and its number.
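To give a feel for what a stack-free Tower of Hanoi procedure looks like, the sketch below uses a well-known closed form that computes each move directly from its move number. It is a stand-in for illustration, not the algorithm derived in this paper: it numbers the pegs 0, 1, 2 rather than 1, 2, 3, it keeps only the move counter, and the final peg depends on the parity of n. The names `hanoi_moves` and `replay` are hypothetical.

```python
def hanoi_moves(n):
    """Yield the 2**n - 1 moves for n discs starting on peg 0.

    Each move (source, target) is computed from its number k alone,
    so no stack of pending calls is kept.
    """
    for k in range(1, 2 ** n):
        source = (k & (k - 1)) % 3
        target = ((k | (k - 1)) + 1) % 3
        yield source, target


def replay(n):
    """Apply the moves to explicit pegs, checking that each one is legal."""
    pegs = [list(range(n, 0, -1)), [], []]
    for source, target in hanoi_moves(n):
        disc = pegs[source].pop()
        # a disc may only be placed on an empty peg or on a larger disc
        assert not pegs[target] or pegs[target][-1] > disc
        pegs[target].append(disc)
    return pegs


print(list(hanoi_moves(2)))
print(replay(3))
```

The `replay` helper exists only to check legality; the generator itself needs no memory beyond the counter k.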

Memory  Efficiency 

In the standard compiler implementation of a recursive definition
of the form of I, that definition is taken to describe a procedure which
calls itself. The procedure uses a stack to temporarily remember, amongst
other things, the set of arguments (= the data structure) associated with the
call. The size to which the stack grows varies and depends on the depth
of the calls. In general, if the definition is non-linear, i.e. has
more than 1 call of the defined function on the right, then the arguments
of the w functions will have to be stacked also. When the memory
efficient algorithm to be described here is applicable, both of these
stacks can be eliminated. Instead only 1 copy of the argument of the
calling function will be saved. All other temporary memory uses in the
algorithm are comparable to those in the standard implementation. It
will be possible to eliminate the need for these stacks for any definition
of form I, provided only, as we have said, that the w function is
associative and the O functions have a uniform inverse.

Although the 'memory efficient' algorithms of S(V') are honestly
so for the most part, the nature of the memory efficiency can be
misleading. The implementing algorithm available when w is 'associative'
and the O-function has an 'inverse' is efficient in the sense that the
memory required is usually of the order of the largest storage required
for the argument (also called a data structure) of f which arises
if f is evaluated by successive substitutions.
Usually this largest data-structure for which memory need be provided
requires a small amount of memory relative to the total of all data-structures
produced during the implementation of the definition for a given
initial data-structure, e.g. of the order of a single member of a set
when a set is being enumerated. Even when the 'inverse' does not exist
it can be incorporated as previously noted, leaving the 'memory
efficiency' notion still viable. However there is another way of obtaining
a 'memory efficient' equivalent algorithm which is deceiving.

* Theorem 2.3.

This technique involves obtaining a technically correct equivalent
recursive definition of $f$, say $f'$, having only one occurrence of $f'$ on the
right, but in compensation involving much larger data structures $X'$ and
a more complex function $o'$ than the corresponding $X$ and $o_i$ of $f$. That is,
for each definition of form I there is an equivalent definition
of the form:

$$
\text{II}\quad
\begin{cases}
f'(X') = q'(X') & \text{if } T(X') \\
f'(X') = w\bigl(f'(o'(X'))\bigr) & \text{if } \neg T(X')
\end{cases}
$$

Initially, $X' \in D_{f'}$.

By equivalent, we mean that there is a 1-1 correspondence
$g$ between $D_f$ and $D_{f'}$, so that for each $d \in D_f$:

$$f(d) = f'(g(d))$$

If $f'$ has an inverse then it can be realized in the same memory efficient
manner as other definitions in F(V'), and if not it can easily be modified
so as to have one while still keeping the result in the form
of II. Memory efficiency, however, means that the memory requirement
will not exceed the size of the largest data-structure which arises as an
argument of $f'$ during evaluation of $f'$. But in this equivalent definition
that data-structure is typically much larger (often exponentially) than that
which could arise in the original definition.

* Theorem 2.1 and theorem 2.2 give the two classical ways this is done,
called breadth-first and depth-first respectively.

The term 'memory efficiency' as used here then requires caution
in its application.

Time  Efficient  Implementations 

As noted earlier the opportunity for time efficient implementations
of recursive definitions in F(V) (= F from here on) arises when repetitive use
of the same operations is necessary in the evaluation of the function.
In the time efficient implementations the originally repeated operations are
done once, the result being remembered for later use. This is classically
called 'pruning' in the Artificial Intelligence literature. For some
subclasses of F the nature of the repeated operations is sufficiently
independent of the particular initial data-structure so that one can design
a class of implementations guaranteed to be more efficient in all cases
than the standard implementation. An example of such a subclass is all
functions which are substitutionally solvable and are of the form given below.

Let $V = \langle x_1, x_2, \ldots, x_n \rangle$ = a vector whose values are integers,
$C_i = \langle C_{i1}, \ldots, C_{in} \rangle$ = a constant vector of integers; $-$ is vector subtraction.

$$
\begin{cases}
f(V) = q(V) & \text{if } V \text{ is any of a finite set of integer vectors, say } K \\
f(V) = w\bigl(f(V - C_1), \ldots, f(V - C_k)\bigr) & \text{if } V \notin K
\end{cases}
$$

Initially $V \in$ a defined set of integer vectors, say $L$.

A specific simple member of this subclass is the definition for the
Fibonacci series ($V$ is a one dimensional vector):

$$
\begin{cases}
f(n) = 1 & \text{if } n = 1 \text{ or } 0 \\
f(n) = f(n-1) + f(n-2) & \text{if } n > 1
\end{cases}
$$

Initially $n \in W$


For any definition in this subclass in which V has m components it is
easy to see that the function can be implemented by m nested DO-loops.
This implementation requires much less than the exponential time involved in a
straightforward implementation of the recursive definition, which does
not take advantage of repeated suboperations. Although detection of a
member of this subclass is relatively easy, the detailed dependence of the
parameters of the nested DO implementation on the properties of w and the
$C_i$ and the sets K and L in this subclass of definition is an interesting study.
This study however is not carried out here.
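For the one dimensional Fibonacci case the contrast between the two implementations can be shown in a few lines. The sketch below is illustrative only; the names `fib_recursive` and `fib_loop` are assumptions, and the single loop plays the role of the single DO-loop for a one-component vector.

```python
def fib_recursive(n):
    """Straightforward implementation of the recursive definition.

    Repeats the same sub-evaluations many times, so its running time
    grows exponentially with n.
    """
    if n <= 1:
        return 1
    return fib_recursive(n - 1) + fib_recursive(n - 2)


def fib_loop(n):
    """Single loop: each value f(i) is computed once and reused."""
    prev, cur = 1, 1              # f(0) and f(1)
    for _ in range(2, n + 1):
        prev, cur = cur, prev + cur
    return cur


print([fib_loop(i) for i in range(8)])
```

The loop version remembers only the two most recent values, which suffices here because the body refers back to f(n-1) and f(n-2) only.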

Another such subclass called the 'explicit history' class will be
considered here. The evaluation of members of this class will be shown
to be equivalent to solving a set of equations* analogous to sets of
linear equations. Consequently an algorithm analogous to Gaussian
elimination will be shown to be an available implementation of explicit
history definitions. This algorithm has polynomial complexity as opposed
to the exponential time required in a straightforward implementation of
the recursive definitions in this class.

* Theorem 3.1.


Equivalent  Formulations  of  Recursive  Definitions 

To be in the class F(V) recursive definitions must be of form I.
Despite this constraint a function may have a number of different
definitions all in F(V). Some of these may have desirable properties
absent in others. Transformations which can be used to obtain equivalent
definitions in F(V) with desirable properties are developed in section 3
of this paper. Actually, example 1.1 gives a definition of the set of
all n-bit binary numbers which may not be entirely natural. One which
may be considered is based on the fact that:

The set of all n-bit binary numbers = the set of all (n-1)-bit
binary numbers each with a 0 appended together (unioned) with the set of
all (n-1)-bit binary numbers each with a 1 appended.

A  formal  statement  of  this  definition  is: 

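Ex. 1.5

$$
\begin{cases}
f(n) = \{\Lambda\} & \text{if } n = 0 \\
f(n) = (\langle 0 \rangle \otimes f(n-1)) \cup (\langle 1 \rangle \otimes f(n-1)) & \text{if } n > 0
\end{cases}
$$

where if $B$ is a set of strings and $\alpha$ a string, $\alpha \otimes B = \{\alpha \| b \mid b \in B\}$

Initially $n \in N$

Then e.g. $f(2) = \langle 0 \rangle \otimes f(1) \cup \langle 1 \rangle \otimes f(1) = \langle 0 \rangle \otimes (\langle 0 \rangle \otimes f(0) \cup \langle 1 \rangle \otimes f(0)) \cup \langle 1 \rangle \otimes f(1)$
$= \langle 0 \rangle \otimes \{\langle 0 \rangle, \langle 1 \rangle\} \cup \langle 1 \rangle \otimes f(1) = \{\langle 00 \rangle, \langle 01 \rangle\} \cup \langle 1 \rangle \otimes f(1) =$ etc.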

This definition, though still strictly in form I, is not in F(V')
because w is not associative. Therefore the scheme S(V') is not directly
available for its realization. However, there are theorems* which will
in this case give an equivalent definition in F(V'), thus once again making
the realization of S(V') available.

* Theorem 3.2

In fact, the definition of example 1.1 can be obtained by the application
of such theorems to example 1.5. It is interesting to compare the above
interpretation of the definition in example 1.5 with one for the equivalent
definition in example 1.1, which was originally claimed to be natural.

Interpretation of Definition 1.1:

The set of all $(n+|\alpha|)$-bit binary numbers which have a prefix $\alpha$ =
the set of all $(n+|\alpha|)$-bit binary numbers having a prefix $\alpha$ followed
by 0, together (unioned) with the set of all $(n+|\alpha|)$-bit binary numbers
having a prefix $\alpha$ followed by 1.
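The claim that the two formulations define the same sets can be checked on small cases. The sketch below is only such a check, not the transformation of section 3; the names `f_prefix` and `f_plain` and the use of Python strings for bit strings are assumptions of the sketch.

```python
def f_prefix(alpha, n):
    """Example 1.1: carry the prefix alpha as part of the argument."""
    if n == 0:
        return {alpha}
    return f_prefix(alpha + "0", n - 1) | f_prefix(alpha + "1", n - 1)


def f_plain(n):
    """Example 1.5: build each n-bit set from the (n-1)-bit set."""
    if n == 0:
        return {""}
    shorter = f_plain(n - 1)
    # the combining step rewrites every member of the recursive results
    return {"0" + b for b in shorter} | {"1" + b for b in shorter}


print(sorted(f_prefix("", 2)), sorted(f_plain(2)))
```

In `f_prefix` the combining function is plain set union, whereas in `f_plain` the combining step also transforms the members of the sets it receives; this is the difference that the discussion of associativity above turns on.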

2.  MEMORY  EFFICIENT  IMPLEMENTATIONS 

The first part of this section, through page 29, is largely devoted
to material which is probably familiar. This is done in order to develop the
definitions of a number of terms which are used later in this section. Although
the concepts are familiar the terms we use may not always be so. Two well
known, 'classical' implementations of recursive definitions, both of which
use stacks, are shown to be valid in these preliminary pages. This is done
for comparison with the 'inverse' implementation, which uses no stacks,
and whose description and justification is the main objective of this
section. The 'classical' implementations are described in a somewhat
unusual way, different from the flowchart form used for the 'inverse'
implementation. Their validity is established in this form, which was
thought to be an exercise of sufficient interest to justify inclusion
here since the form in which these implementations are given is generally
applicable. The inverse implementation, for example, could be given in
this form. In any case, the material in the preliminary pages can easily
be skipped and only referred to in order to pick up definitions of terms
used later, without losing the main point of the paper.


Definition  of  Standard  Recursive  Scheme  F: 


Consider the set $F'$ of all functions $f$ that can be defined as
follows:

Def. 2.1

$$
\text{form I}\quad
\begin{cases}
f(X) = q(X) & \text{if } T(X) \\
f(X) = w\bigl(f(o_1(X)), \ldots, f(o_{m(X)}(X))\bigr) & \text{if } \neg T(X)
\end{cases}
$$

initially $X \in D_f$

where the primitive functions and predicates which are used in the
definition are weakly constrained as to the nature and extent of their
domains and ranges. $D_f$ is the set of initial data-structures and may be
any set. Other sets must be included in some of the domains of some of
the primitive functions. These other sets are defined recursively, using
the primitive functions. First these sets are named and their relation to
the primitive functions given, then they are defined.

$m$ is a function whose domain must include the set $A_f$ and whose range is the
positive integers: $m(X) \ge 1$ for all $X \in A_f$.

$O$ is a set of functions $\{o_1, o_2, \ldots\}$.
The domain of $o_i$ must include the set $\delta^i$. $A_f$ is the union of all
the domains of the functions $o_i$ and is called the domain of the O-function.
The range of $o_i$ must include the set $\rho^i$.
The union of the sets $\rho^i$ of all the functions in $O$ is the range
of the O-function and is called $P_f$.

$T$ is a predicate whose domain includes $D_f \cup P_f$. Its range is
{true, false}.

$q$ is a function whose domain must include $Q_f$. Its range may be
any set, say $W^0$.

$w$ is a function whose range is called $W_f$ and whose domain must include ...

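The sets named above are defined as follows (the subscript $f$ is
dropped where it is not essential):

$$A^1 = \{d \mid d \in D \text{ and } \neg T(d)\}; \text{ and for } j > 1$$

$$A^j = \{o_i(X) \mid X \in A^{j-1} \text{ and } i \le m(X) \text{ and } \neg T(o_i(X))\}$$

$$A = \bigcup_j A^j$$

The set $\delta^i$ of $o_i \in O$ is:

$$\delta^i = \{X \mid X \in A \text{ and } i \le m(X)\}$$

The range of $O$ is:

$$P = \{o_i(X) \mid X \in A \text{ and } i \le m(X)\}$$

The range $\rho^i$ of $o_i \in O$ is:

$$\rho^i = \{o_i(X) \mid X \in \delta^i\}$$

The set of terminal data-structures $Q$ is:

$$Q = P - A$$

The set $W$ is defined as follows (with $W^0$ the range of $q$):

$$W^1 = \{w(X_1, \ldots, X_n) \mid X_i \in W^0,\ n \text{ a positive integer}\}; \text{ and for } j > 1,$$

$$W^j = \{w(X_1, \ldots, X_n) \mid X_i \in W^k,\ k < j;\ n \in N\}$$

$$W = \bigcup_j W^j$$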


If in addition to being a member of the set $F'$, a recursive
definition is terminating, as defined below, it is a member
of the set $F$. We need some preliminary definitions.

If $\langle i_1, \ldots, i_n \rangle = I$ is a sequence of integers then
$o_I(X) = o_{i_n \ldots i_1}(X)$ is an abbreviation for
$o_{i_n}(\ldots o_{i_2}(o_{i_1}(X)) \ldots)$; $o_\Lambda(X) = X$.

A length 1 sequence of integers $\langle i_1 \rangle$ is applicable to a
data-structure $X \in A_f$ if $i_1 \le m(X)$. A length $n$ sequence of integers
$\langle i_1, \ldots, i_n \rangle$ is applicable to a data-structure $X$ if
$\langle i_1, \ldots, i_{n-1} \rangle$ is applicable to $X$ and
$\langle i_n \rangle$ is applicable to $o_{i_{n-1} \ldots i_1}(X)$.

$\mathcal{I}(X)$ is the set of all integer sequences applicable to $X \in A_f$.

$f$ is terminating iff $\forall d$, $d \in D$ implies $\mathcal{I}(d)$ is finite.

Note that if $\mathcal{I}(X)$ is finite it cannot contain an infinite sequence,
because it always contains all prefixes of any sequence it contains.

This completes our definition of F.* Next we give some simple consequences of the definition which will be used later. First, the substitutionally solvable property that for d ∈ D, I(d) is finite can be extended to any X ∈ A_f. This is done in lemmas 2.1 and 2.2.

Simple Properties of f ∈ F

Lemma 2.1: If f ∈ F and X ∈ A_f then ∃ a data-structure d ∈ D and ∃ an integer sequence I ∈ I(d) such that o_I(d) = X.


Proof: If X ∈ A_f then obviously there exists some j (at least 1) such that X ∈ A^j. The lemma is proven by induction on the sets A^j. Assume there is a length k-1 sequence I_{k-1} for each data-structure Y ∈ A^(k-1) and a d ∈ D such that o_{I_{k-1}}(d) = Y. Then it follows, by definition of A^k, that if X ∈ A^k then X = o_i(Y) for some i ≤ m(Y) and Y ∈ A^(k-1). Thus X = o_i(o_{I_{k-1}}(d)) = o_{I_k}(d), where I_k is I_{k-1} followed by i. Since also D ⊇ A^1, and o_λ(d) = d for each d ∈ D, the proof is complete.

Lemma 2.2: If f ∈ F and X ∈ A_f, then I(X) is finite.

Proof: From the previous lemma the data-structure X = o_I(d) for some d ∈ D and integer sequence I. Therefore I(d) ⊇ the set consisting of I concatenated with each member of I(X). Thus if I(X) is not finite, I(d) cannot be finite, but this contradicts the condition that f ∈ F is substitutionally solvable.

* F as defined here is the same as F(V) as defined in the Introduction.

Another consequence of the definition of F is that the data-structures in A_f can be usefully ordered in another, almost reverse, manner than the ordering by membership in the subsets A^j. In most of the subsequent inductive proofs, induction will be carried out on this ordering.

Ordering the Data-Structures in A (Remoteness):

For any function f in F:

We say a data-structure X in A_f ∪ Q_f is of remoteness 0 (or is terminal) if X ∈ Q_f.

We say a data-structure X in A_f ∪ Q_f is of remoteness n if:

(1) ∃i: i ≤ m(X) and o_i(X) is of remoteness n-1 and

(2) ∀i: i ≤ m(X) implies o_i(X) is of remoteness n-k and k ≥ 1.*

* Alternately this can be phrased 'of remoteness < n'.

Lemma 2.3: If f ∈ F, then there is a function r with domain A_f ∪ Q_f such that if X ∈ A_f ∪ Q_f then r(X) = the remoteness of X.

Proof: For each X ∈ A_f ∪ Q_f let r(X) be the maximum of the length of all the sequences in I(X). For each X ∈ A_f ∪ Q_f, X is of remoteness r(X). This is shown by induction. If T(X) then I(X) is empty and r(X) = 0. Assume that if r(X) < n, X is of remoteness r(X). Let r(X) = n, i.e. there is a longest sequence of length n, say I = <i_1, ..., i_n> in I(X). Let o_{i_1}(X) = Y. Then I' = <i_2, ..., i_n> is in I(Y). Furthermore, no sequence applicable to Y is longer than I' because otherwise I could not have been a longest sequence in I(X). So r(Y) = n-1 and Y is of remoteness r(Y) = n-1. Therefore, since o_{i_1}(X) = Y and for all j ≠ i_1, j ≤ m(X), r(o_j(X)) ≤ n-1, X is of remoteness r(X) = n by definition of remoteness.
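The function r used in the proof of Lemma 2.3 can be sketched in Python. This is only an illustration under stated assumptions: the names applicable_sequences and remoteness are hypothetical, the primitives are passed as ordinary functions T(x), m(x) and o(i, x), and the toy definition at the end (a Fibonacci-style recursion) is not taken from the report.

```python
def applicable_sequences(x, T, m, o):
    """Return I(x): all index sequences applicable to x, as tuples."""
    if T(x):
        return []  # no sequence is applicable to a terminal data-structure
    sequences = []
    for i in range(1, m(x) + 1):
        sequences.append((i,))  # the length-1 sequence <i>
        child = o(i, x)
        for rest in applicable_sequences(child, T, m, o):
            sequences.append((i,) + rest)  # <i> followed by a sequence applicable to o_i(x)
    return sequences


def remoteness(x, T, m, o):
    """r(x): length of the longest sequence in I(x), or 0 when I(x) is empty."""
    return max((len(s) for s in applicable_sequences(x, T, m, o)), default=0)


# Hypothetical toy definition (an assumption, not from the report):
# terminal when n <= 1, two successors n-1 and n-2.
T = lambda n: n <= 1
m = lambda n: 2
o = lambda i, n: n - i

print(sorted(applicable_sequences(3, T, m, o)))
print([remoteness(n, T, m, o) for n in range(6)])
```

Because I(x) contains every prefix of each of its members, the enumeration is finite exactly when the definition is terminating at x; for a non-terminating definition this sketch would not return.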


Interpretation  of  the  Recursive  Definitions  in  F 

In the next paragraphs we briefly sketch some important well known facts about the interpretation of terminating recursive definitions such as f ∈ F.

A recursive definition f ∈ F defines the function f on the domain D_f by giving a relation (in terms of the primitive function w) that f(d) must satisfy with the same function f at some different argument values, namely with the f(o_i(d)) for 1 ≤ i ≤ m(d). The same definition is applicable to define f(o_i(d)) for each 1 ≤ i ≤ m(d), at still other arguments. This process of repeated redefinition of f with different arguments will eventually end (because f ∈ F is substitutionally solvable) with arguments X which are terminal, at which point the definition of f gives a definite value = q(X) to be assigned to f(X). Thus this process will close, and a definite value will be assigned to f(d). It can easily be shown that this is a unique value.


This process of re-definition can be formulated as a non-deterministic procedure involving successive substitutions in an expression whose evaluation will give the value of f(d) for d ∈ D_f.

Let E^i be an expression involving a composition of the w and o_i functions, d ∈ D, and occurrences of f(α), f being the symbol for the defined function, and α its argument. Let E^1 = f(d) and in general to get E^(i+1) from E^i do the following:

Choose any occurrence of f(α) in E^i. Note that α itself will never contain any occurrence of f; α will just involve a composition of w's, o_i's and d. Evaluate α; this can be done because it only involves given primitive functions and a given data-structure d.

If the evaluated α is terminal, i.e. T(α) is true, then f(α) is replaced by q(α) to obtain E^(i+1). Thus the right side of the first equation of the definition of f is substituted for f(α). If on the other hand ¬T(α), then the value of α is substituted for X on the right side of the second equation in the definition of f, and then this entire resulting right side is substituted for f(α) in E^i to produce E^(i+1).

Substitutions are continued until E^i contains no occurrences of f. This must eventually occur because f is substitutionally solvable. At this point in the evaluation E^i is the value of f(d).

The result of this non-deterministic procedure starting with f(d) may be interpreted as the definition of f(d).

This definition is non-deterministic because any occurrence of f(α) in E^i may be legitimately chosen to be substituted for next. No order is prescribed.
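The substitution procedure can be illustrated with a short Python sketch. Everything named here is an assumption made for the illustration: the expression is held as a small tree of tagged lists, the occurrence to substitute is picked with a random number generator to mimic the absence of a prescribed order, and the toy definition is a Fibonacci-style recursion that does not come from the report. Unlike the text, the sketch evaluates each argument as soon as it is created rather than when the occurrence is chosen.

```python
import random


def pending_occurrences(expr, found):
    """Collect every node ["f", x], i.e. every occurrence of f in the expression."""
    if expr[0] == "f":
        found.append(expr)
    elif expr[0] == "w":
        for child in expr[1]:
            pending_occurrences(child, found)
    return found


def value_of(expr, w):
    """Evaluate an expression that no longer contains any occurrence of f."""
    if expr[0] == "val":
        return expr[1]
    return w([value_of(child, w) for child in expr[1]])


def evaluate_by_substitution(d, T, q, m, o, w, rng):
    """Return (value of f(d), number of substitutions performed)."""
    expr = ["f", d]  # the initial expression is f(d)
    substitutions = 0
    while True:
        pending = pending_occurrences(expr, [])
        if not pending:
            return value_of(expr, w), substitutions
        node = rng.choice(pending)  # any occurrence may legitimately be chosen
        x = node[1]
        if T(x):
            node[:] = ["val", q(x)]  # first equation: f(x) = q(x)
        else:
            node[:] = ["w", [["f", o(i, x)] for i in range(1, m(x) + 1)]]  # second equation
        substitutions += 1


# Hypothetical toy definition (an assumption, not from the report).
T = lambda n: n <= 1
q = lambda n: n
m = lambda n: 2
o = lambda i, n: n - i
w = sum

for seed in range(3):
    print(evaluate_by_substitution(6, T, q, m, o, w, random.Random(seed)))
```

Whatever order the generator happens to choose, both the value and the number of substitutions come out the same, which is the sense in which the unordered procedure still defines a unique value.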


Although the meaning of the recursive definition is tied to this non-deterministic unordered procedure, the common connotation of 'recursive implementation' involves a fixed ordering of the substitutions for occurrences of f(α) in the successive expressions E^i. This order requires substitution always for the leftmost occurrence of f(α) in E^i.

This is the order implemented in virtually all compilers which accept recursive definitions. It is sometimes called depth-first ordering. This ordering amongst others will be investigated here. We call the depth-first ordering the standard implementation, recognizing that strictly there is not a single implementation entitled to be called the recursive implementation.

So given a recursive definition, and the order in which the f(α) occurrences are to be substituted for, the basis of a deterministic implementation is established. This can be detailed in a flowchart and is one of the ways in which we will specify such an implementation.

There is a subset of recursive definitions of f ∈ F, however, in which one need not specify the order explicitly. There can be no question of the order of substitution for occurrences of f(α) when each expression E^i has only one occurrence of f(α). This will occur for any definition of f ∈ F whose second equation has only one occurrence of f(α) on the right. Such a unary recursive definition itself then is a second way to specify an implementation.

An implementation which can be specified in one of these forms can also be specified in the other form.

In the subsequent sections both ways of specifying an implementation are used.


In the next section unary definitions are developed which represent the classical depth-first and breadth-first implementations of functions f ∈ F. By showing the equivalence of these unary definitions to f ∈ F the validity of these implementations is established.

Later another implementation for f ∈ F called an inverse implementation will be described by a flowchart and will be proven to be valid. This inverse implementation, unlike the classical implementations, does not require a stack.

Classical  Implementations: 

Notation: 

We  need  notation  for  operations  which  replace  a  component  of  a  vector 
with  single  or  multiple  components  which  are  functions  of  the  replaced 
component. 

Let L be a vector (list); L = <l_1, ..., l_n>.

Let t_i denote an individual member or subsequence of L which has some specified properties P_i. The notation:

L[t_1 → X_1; ...; t_n → X_n]

denotes the list obtained by replacing all components of L which have property P_i by X_i. X_i may be an individual component or a number of components. For example, if P_1 is the property of being an odd indexed component of L and if n (= the number of components of L) is even then the meaning of L[t_1 → a] is given by:

L[t_1 → a] = <a, l_2, a, l_4, a, ..., l_n>
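The replacement notation can be mimicked by a small Python helper. This is a minimal sketch under assumptions: the name replace_components is hypothetical, the property is given as a predicate on the (1-based) index and the component, and only individual components are matched, not subsequences.

```python
def replace_components(L, has_property, replacement):
    """Return a copy of L in which each component having the property is
    replaced by the list of components produced by replacement(component)."""
    result = []
    for index, component in enumerate(L, start=1):  # components are indexed from 1
        if has_property(index, component):
            result.extend(replacement(component))   # one or several new components
        else:
            result.append(component)
    return result


# The odd-index example from the text, with n = 4.
L = ["l1", "l2", "l3", "l4"]
odd_indexed = lambda index, component: index % 2 == 1
print(replace_components(L, odd_indexed, lambda component: ["a"]))
```

Returning a list from the replacement function covers the case in which a single component is replaced by a number of components, which is how the bracketed groups of the following sections are built.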


The next two theorems will show that for each f ∈ F there are two unary definitions F_B and F_D, both also members of F and both equivalent to f. Since they are unary the implementation for evaluating these definitions is deterministic (see pg 19). The algorithm F_B is similar to the classical 'breadth-first' algorithm and F_D is similar to the classical 'depth-first' algorithm.*

* Depth-first and breadth-first algorithms are described in (4) as algorithms for searching a state space. Such searches (and more) can usually be modelled by a recursive definition of our standard form, with the nature of the transition from state to state given by the O-functions and the nature of the search given by the w function.

Breadth-First  Implementation 
Let f ∈ F, thus

I   1) f(X) = q(X)                                 if T(X)
    2) f(X) = w(f(o_1(X)), ..., f(o_{m(X)}(X)))    if ¬T(X)
    3) initially X ∈ D

To define the function F_B which is equivalent to f we first need to define a number of new primitive functions and predicates in terms of the primitive functions O, w, q, m and predicate T of f. For this the notation just introduced is used.

L = <l_1, ..., l_n> and Z = <z_1, ..., z_p> are both vectors. The components of L are either brackets in the set BRACK = {'{', '}'} or members of A_f = the domain of f, or of Q_f = the terminal data-structures of f. (A_f and Q_f are assumed not to contain any of the brackets in BRACK.) The components of Z are either members of BRACK or of w_f = the range of q, or of W_f = the range of w.

With X ∈ A_f, t ∈ Q_f and t_i ∈ w_f ∪ W_f let:

O_B(L) = L[X → '{', o_1(X), ..., o_{m(X)}(X), '}']
T_B(L) = true if every component of L is a member of BRACK or of Q_f
Q_B(L) = L[t → q(t)]
M_B(L) = 1
W_B(Z) = Z['{', t_1, ..., t_n, '}' → w(t_1, ..., t_n)]

Now we can define F_B:

II  1) F_B(L) = Q_B(L)                 if T_B(L)
    2) F_B(L) = W_B(F_B(O_B(L)))       if ¬T_B(L)
    3) initially L = <X>, X ∈ D

Theorem 2.1: For each f ∈ F (as I above) there is a function F_B ∈ F (as II above) such that F_B(<X>) = <f(X)> for all X ∈ D.

Proof: The proof uses induction on the remoteness of the data-structures X ∈ A_f ∪ Q_f which, along with brackets, constitute the significant components of the vector data-structures L ∈ A_{F_B}.

What we will show is first that for any L in A_{F_B} whose components are all members of Q_f, designated by t, or are in BRACK, that:

F_B(L) = L[t → f(t)]

This is true since if T_B(L) is true then with t ∈ Q_f

F_B(L) = Q_B(L)           by II 1
       = L[t → q(t)]      by definition of Q_B
       = L[t → f(t)]      by I 1

Secondly we show inductively that if L contains components of A_f, designated X, then

(H)  F_B(L) = L[X → f(X)]

Assume that as long as all components of L, other than those in BRACK, are of remoteness < n, statement (H) is true.

If T_B(L) is not true and all components of L other than those in BRACK are of remoteness ≤ n, n > 0, and at least one such component is of remoteness n, then (X is used to designate a member of A_f, t a member of Q_f):

F_B(L) = W_B(F_B(O_B(L)))                                      by II 2
F_B(L) = W_B(F_B(L[X → '{', o_1(X), ..., o_{m(X)}(X), '}']))   by the definition of O_B
F_B(L) = W_B(F_B(L'))                                          abbreviating the expression above with L'

Clearly all components in L' are of remoteness < n and at least one has remoteness n-1. So the inductive hypothesis may be used for all X ∈ A_f in L'.

F_B(L) = W_B(L'[X → f(X)])

But L' = L[X → '{', o_1(X), ..., o_{m(X)}(X), '}']

So L'[X → f(X)] = L[X → '{', f(o_1(X)), ..., f(o_{m(X)}(X)), '}']

So W_B(L'[X → f(X)]) = L[X → w(f(o_1(X)), ..., f(o_{m(X)}(X)))]   by the definition of W_B.

Thus:

F_B(L) = L[X → f(X)]

Thus for L = <X>, X ∈ D:

F_B(L) = <X>[X → f(X)] = <f(X)>
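Definition II can be tried out with the following Python sketch. It is an illustration under assumptions rather than the report's own program: the brackets are represented by two sentinel objects, the primitives are passed as functions T(x), q(x), m(x), o(i, x) and w(list), W_B collapses only bracketed groups that contain no further brackets, and the toy definition is a hypothetical Fibonacci-style recursion.

```python
OPEN, CLOSE = object(), object()  # stand-ins for the two brackets in BRACK


def is_bracket(c):
    return c is OPEN or c is CLOSE


def O_B(L, T, m, o):
    """Replace every non-terminal X by '{', o_1(X), ..., o_m(X)(X), '}'."""
    out = []
    for c in L:
        if is_bracket(c) or T(c):
            out.append(c)
        else:
            out.append(OPEN)
            out.extend(o(i, c) for i in range(1, m(c) + 1))
            out.append(CLOSE)
    return out


def T_B(L, T):
    """True when every component is a bracket or a terminal data-structure."""
    return all(is_bracket(c) or T(c) for c in L)


def Q_B(L, q):
    """Replace every terminal t by q(t)."""
    return [c if is_bracket(c) else q(c) for c in L]


def W_B(Z, w):
    """Replace each group '{', t_1, ..., t_n, '}' of plain values by w(t_1, ..., t_n)."""
    out, i = [], 0
    while i < len(Z):
        if Z[i] is OPEN:
            j = i + 1
            while j < len(Z) and not is_bracket(Z[j]):
                j += 1
            if j < len(Z) and Z[j] is CLOSE:  # innermost group found
                out.append(w(Z[i + 1:j]))
                i = j + 1
                continue
        out.append(Z[i])
        i += 1
    return out


def F_B(L, T, q, m, o, w):
    if T_B(L, T):
        return Q_B(L, q)                                # II 1
    return W_B(F_B(O_B(L, T, m, o), T, q, m, o, w), w)  # II 2


# Hypothetical toy definition (an assumption, not from the report).
T = lambda n: n <= 1
q = lambda n: n
m = lambda n: 2
o = lambda i, n: n - i
w = sum

print(F_B([6], T, q, m, o, w))
```

Each application of O_B expands a whole level of the list at once, and each application of W_B on the way back collapses one level of brackets, which is what makes the evaluation breadth-first.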

Depth-First  Implementation 

The depth-first function F_D equivalent to f in I above is defined as follows:

III 1) F_D(L,k) = Q_D(L,k)            if T_D(L,k)
    2) F_D(L,k) = F_D(O_D(L,k))       if ¬T_D(L,k)
    3) initially L = <X>, X ∈ D_f, k = 1 (we say then <L,k> ∈ D_D)

where, with l_k = the kth component of L, the definitions of O_D, T_D and Q_D are:

(1) O_D(L,k) = <L[l_k → '{', o_1(l_k), ..., o_{m(l_k)}(l_k), '}'], k>      if l_k ∈ A_f

(2) O_D(L,k) = <L[l_k → q(l_k)], k+1>                                      if l_k ∈ Q_f

(3) O_D(L,k) = <L['{', t_1, ..., t_n, '}' → w(t_1, ..., t_n)], k-n-1>      if l_k = '}' and the t_i are all in w_f ∪ W_f
    (and '{', t_1, ..., t_n precedes '}' in L as assumed)

(4) O_D(L,k) = <L, k+1>                                                    if l_k = '{' or if l_k ∈ w_f ∪ W_f

T_D(L,k) = true if |L| = 1 and k = 2.

Q_D(L,k) = l_1

Theorem 2.2: If f (of definition I above) ∈ F, then F_D (of definition III above) is also ∈ F and for d ∈ D, F_D(<d>,1) = f(d).

Proof:

Again the proof will be by induction on remoteness. It will be shown that if l_k is the kth component of L and l_k ∈ A_f ∪ Q_f then:

(H)  F_D(L,k) = F_D(L[l_k → f(l_k)], k)

This is certainly true if l_k is of remoteness 0, i.e. if l_k ∈ Q_f, because then:

F_D(L,k) = F_D(L[l_k → q(l_k)], k+1)     by definition of O_D(2) and III 2
         = F_D(L[l_k → f(l_k)], k+1)     by definition of f, I 1
         = F_D(L[l_k → f(l_k)], k)       by definition of O_D(4)

Assume (H) is true if l_k has remoteness < n. Then if l_k is of remoteness n > 0, and is ∈ A_f, it follows from O_D(1) that

F_D(L,k) = F_D(L[l_k → '{', o_1(l_k), ..., o_{m(l_k)}(l_k), '}'], k)

with o_i(l_k) for 1 ≤ i ≤ m(l_k) each being of remoteness < n.

Rewriting the above in expanded notation, with L' denoting the components of L which follow l_k:

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, '{', o_1(l_k), ..., o_{m(l_k)}(l_k), '}'>//L', k)

Then by definition O_D(4):

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, '{', o_1(l_k), ..., o_{m(l_k)}(l_k), '}'>//L', k+1)

Since o_1(l_k) ∈ A_f ∪ Q_f and is of remoteness < n we have by (H), and then by an application of O_D(4) again:

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, '{', f(o_1(l_k)), o_2(l_k), ..., o_{m(l_k)}(l_k), '}'>//L', k+2)

And since in fact for i = 1 to m(l_k), o_i(l_k) ∈ A_f ∪ Q_f and is of remoteness < n, by repeated application of (H) and O_D(4):

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, '{', f(o_1(l_k)), ..., f(o_{m(l_k)}(l_k)), '}'>//L', k+m(l_k)+1)

And finally by O_D(3):

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, w(f(o_1(l_k)), ..., f(o_{m(l_k)}(l_k)))>//L', k)

Since f(l_k) = w(f(o_1(l_k)), ..., f(o_{m(l_k)}(l_k))):

F_D(L,k) = F_D(<l_1, ..., l_{k-1}, f(l_k)>//L', k)

And compacting the notation:

F_D(L,k) = F_D(L[l_k → f(l_k)], k)

For L = <d>, k = 1:

F_D(<d>,1) = F_D(<d>[d → f(d)], 1)
           = F_D(<f(d)>, 2)      since f(d) is ∈ domain of w, O_D(4) is applicable
           = f(d)                since <<f(d)>, 2> is terminal
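Because definition III is unary, it can be run as a simple loop over the pair <L,k>. The Python sketch below is an illustration under assumptions: computed values are wrapped in a hypothetical Val class so that they can be told apart from data-structures, the brackets are sentinel objects, the terminal value is read off as the single remaining component, and the toy definition is a Fibonacci-style recursion that is not from the report.

```python
OPEN, CLOSE = object(), object()  # stand-ins for the two brackets


class Val:
    """Wrapper marking a component as a computed value (a member of the ranges of q and w)."""
    def __init__(self, v):
        self.v = v


def F_D(d, T, q, m, o, w):
    L, k = [d], 1                                   # initially L = <d>, k = 1
    while not (len(L) == 1 and k == 2):             # terminal test of definition III
        item = L[k - 1]                             # the kth component (1-based)
        if item is CLOSE:                           # rule (3): collapse the group ending here
            start = k - 2
            while L[start] is not OPEN:
                start -= 1
            args = [t.v for t in L[start + 1:k - 1]]
            L[start:k] = [Val(w(args))]
            k = start + 1                           # equals k - n - 1 for a group of n values
        elif item is OPEN or isinstance(item, Val): # rule (4): step over it
            k += 1
        elif T(item):                               # rule (2): terminal data-structure
            L[k - 1] = Val(q(item))
            k += 1
        else:                                       # rule (1): expand in place, k unchanged
            kids = [o(i, item) for i in range(1, m(item) + 1)]
            L[k - 1:k] = [OPEN] + kids + [CLOSE]
    return L[0].v


# Hypothetical toy definition (an assumption, not from the report).
T = lambda n: n <= 1
q = lambda n: n
m = lambda n: 2
o = lambda i, n: n - i
w = sum

print(F_D(6, T, q, m, o, w))
```

The list L plays the role of the stack of the usual recursive implementation: pending siblings and the closing brackets of every unfinished group stay in it until the index k reaches them.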

Properties  of  f  e  F  Sufficient  for  Memory  Efficient  Implementations 

Another implementation, more efficient than the two classical ones, is available when the recursive definition f ∈ F has some special properties. These properties are now defined.

Associativity: Associativity has the usual meaning here. The function w is associative if:

w(a_1, ..., a_m) = w(w(a_1, ..., a_{m-1}), a_m)   for m ≥ 3

w = minimum, sum, catenation and union provide examples of w-functions with this property. In each case one can compute w(a_1, ..., a_m) as follows:

X ← 0_w
For i = 1 to m
    Y ← w(X, a_i)
    X ← Y
End

thus requiring at any one time memory for at most 2 copies of the result of w(a_1, ..., a_j), j ≤ m. If w is the function minimum, this memory does not depend on the number, but only on the value of its arguments, a_i. If w is catenation, sum, or union the memory required will increase, albeit at different rates, with the number of arguments. There is, however, a significant difference in use of the memory between a computation of catenation and of union. To obtain catenate(a,b), b needs only be attached at the end of a. To obtain union(a,b), a must be searched for an occurrence of a member of b. If a represents the result of a previous computation then in the union case it is necessary to re-access this memory whereas this is not necessary in the catenation case. This is an important consideration because memory that is not re-accessed can be located in areas of memory (disc) which need not be easy to access (as is core). The temporary memory requirements for the implementation of a function then do not depend on the usual mathematical properties of that function only, but also depend on the means available for accessing the memory.^1 Nevertheless, for compactness our results are given in terms of the usual mathematical properties, so caution is needed in their interpretation.
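The two-cell computation described above corresponds to an ordinary left fold. The following Python lines are a minimal sketch under assumptions: the name fold is hypothetical, w is given in its two-argument form, and a starting value that leaves the first argument unchanged is supplied by the caller.

```python
def fold(arguments, w2, zero):
    """Combine the arguments one at a time, holding only the cells X and Y."""
    x = zero                  # starting value, assumed to satisfy w2(zero, a) == a
    for a in arguments:
        y = w2(x, a)          # Y receives w(X, a_i)
        x = y                 # X receives Y
    return x


print(fold([5, 2, 9], min, float("inf")))               # minimum
print(fold([5, 2, 9], lambda u, v: u + v, 0))           # sum
print(fold([[1], [2, 3], [4]], lambda u, v: u + v, [])) # catenation
print(fold([{1}, {2, 1}], lambda u, v: u | v, set()))   # union
```

For minimum the cell X never needs more room than a single argument, whereas for catenation and union it grows with the number of arguments, as the paragraph above notes.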


1. It is also true that there may be some advantage in time efficiency in one grouping of the arguments of w over another though both give the same result when w is associative. An example of such a function is merge, i.e. merge(a_1, ..., a_m) in which the a_i are each finite sorted sets of numbers.

Uniform Inverse:

Consider a set of functions H = {h_1, ..., h_M}. Let V_i be the domain over which h_i is defined and let R_i be the corresponding range of h_i. Then we will say V = ∪ V_i is the domain of H and R = ∪ R_i is its range.

The set of functions H is said to have a uniform inverse on the domain V if:

(1) Every h ∈ H has an inverse and

(2) R_i ∩ R_j = ∅ for every R_i ≠ R_j in R.

If H has a uniform inverse then it is easy to see that the following two 'uniform inverse' functions on R exist for r ∈ R_i ⊆ R:

(1) H^-1(r) = d ∈ V such that h_i(d) = r

(2) i_H(r) = i, the index of the range R_i of which r is a member.

A recursive definition, f ∈ F, has a uniform inverse if the set of functions o_i ∈ O in f has a uniform inverse.
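On finite domains the two conditions can be checked mechanically. The sketch below is only an illustration under assumptions: each function is given as a Python dict from its domain to its range, the name uniform_inverse is hypothetical, and any range element reached twice (by the same function or by two different ones) is treated as a failure.

```python
def uniform_inverse(H):
    """H is a list of dicts; H[i-1] maps the domain of h_i to its range.
    Return (inverse, index) as dicts over the range, or None if there is no uniform inverse."""
    inverse, index = {}, {}
    for i, h in enumerate(H, start=1):
        for d, r in h.items():
            if r in inverse:
                return None       # r has two preimages: h_i not invertible or ranges overlap
            inverse[r] = d        # the preimage of r
            index[r] = i          # the index of the range containing r
    return inverse, index


# Hypothetical examples (assumptions, not from the report).
heap_tree = [{n: 2 * n for n in range(1, 8)}, {n: 2 * n + 1 for n in range(1, 8)}]
fib_like = [{n: n - 1 for n in range(2, 7)}, {n: n - 2 for n in range(2, 7)}]

print(uniform_inverse(heap_tree) is not None)
print(uniform_inverse(fib_like))
```

In the first example every data-structure determines both its predecessor and the index of the function that produced it; in the second the ranges overlap, so neither can be recovered from the data-structure alone.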

For a given function set O it is possible that none, one, or two of the pair O^-1, i_O exist. Despite the fact that the uniform inverse is a strong condition it does often occur. Furthermore when it doesn't, there is always a strongly equivalent definition which does have a uniform inverse. This is shown after a short digression required to define strong equivalence.

Equivalence  of  Recursive  Definitions: 

Consider  two  definitions  in  F: 

(1) f on domain D

    f(X) = q(X)                                   if T(X)
    f(X) = w(f(o_1(X)), ..., f(o_{m(X)}(X)))      if ¬T(X)
    initially X = d ∈ D

(2) g on domain D'

    g(X') = q'(X')                                       if T'(X')
    g(X') = w'(g(o'_1(X')), ..., g(o'_{m'(X')}(X')))     if ¬T'(X')
    initially X' = d' ∈ D'

If there is a 1-1 correspondence between D and D' such that whenever d ∈ D and d' ∈ D' are two corresponding data-structures f(d) = g(d'), then the two definitions are equivalent. The above correspondence may be extended to one between A_f and A_g, with δ ∈ A_f corresponding to δ' ∈ A_g, by having o_i(δ) correspond to o'_i(δ') whenever δ corresponds to δ' and o_i(δ) and o'_i(δ') are both defined. This is called a structural correspondence. If in addition to such a structural correspondence of A_f to A_g the following conditions hold

(1) T(δ) = T'(δ')

(2) q(δ) = q'(δ') if T(δ) (and T'(δ'))

(3) m(δ) = m'(δ') if ¬T(δ) (and ¬T'(δ'))

(4) w = w'

then f and g are strongly equivalent. (Note that the structural correspondence of A_f to A_g need not be 1-1. It will not even necessarily be defined on all members of A_f and A_g unless the conditions (1) through (4) are satisfied.)

Strong  equivalence  of  two  definitions  implies  that  they  not  only  give 
the  same  results  but  also  require  the  same  number  of  substitutions  in 
their  evaluation  for  corresponding  initial  arguments. 

As an example of a strong equivalence, consider the two functions f and g each in F:

(1)    f(X) = q(X)                                                    if T(X)
       f(X) = w(f(o_1(X)), ..., f(o_{m(X)}(X)))                       if ¬T(X)
       initially X = d ∈ D

(2) a) g(X,Y) = q(X)                                                  if T(X)
    b) g(X,Y) = w(g(o_1(X), h_1(Y)), ..., g(o_{m(X)}(X), h_{m(X)}(Y)))   if ¬T(X)
       initially <X,Y> = <d, y_d> ∈ D' with d ∈ D and y_d ∈ a set Y

(H = {h_1, ..., h_M} is a set of primitive functions)

Let data-structure d ∈ D correspond to <d, y_d> ∈ D'. Extend this correspondence to one between A_f and A_g by letting o_i(X) ∈ A_f correspond to <o_i(X), h_i(Y)> ∈ A_g whenever X ∈ A_f corresponds to <X,Y> ∈ A_g and ¬T(X) and i ≤ m(X). For example if d ∈ D and o_i(d) is defined then it corresponds to <o_i(d), h_i(y_d)> ∈ A_g.

For each member of A_f this correspondence defines a corresponding member of A_g. This follows because every member in A_f is either in D, for which the correspondence is given explicitly, or it = o_i(X) for X ∈ A_f where o_i is defined and ¬T(X), in which case the correspondence to a member of A_g is given, since the existence of o'_i(X,Y) depends only on X, because m'(X,Y) = m(X) and T'(X,Y) = T(X).

Conditions (1) through (4) are obviously satisfied for this correspondence in the above definitions. Furthermore, the function g(X,Y) is independent of Y, its second argument. This is shown inductively as follows. Directly from the definition (2a) we see that g(X,Y) is independent of Y when (X,Y) is of remoteness 0. Its being of remoteness 0 is also independent of Y. Referring to (2b), if it is assumed that each term g(o_i(X), h_i(Y)) appearing on the right is independent of its second argument then it follows certainly that g(X,Y) on the left of (2b) is independent of Y. If the argument on the left side of (2b) is of remoteness n from terminal then all the arguments of terms on the right are of remoteness < n from terminal. Thus the inductive argument is completed, concluding that g(X,Y) is independent of Y if X and thus if (X,Y) is of remoteness 0, 1, 2, ..., n.

Thus definition (2) can be rewritten removing Y, which with f replacing g is the same as (1). Therefore

Lemma 2.5: g and f above are strongly equivalent.

Since  the  value  of  g(X,Y)  is  independent  of  Y  it  may  seem  silly  to 
ever  construct  such  a  definition,  with  a  'redundant'  Y,  to  replace  f,  or 
alternatively  that  such  a  redundant  Y  would  arise  inadvertently  in  g  to 
be  removed  by  replacement  with  the  equivalent  f.  The  following  theorem, 
however,  demonstrates  that  such  'redundant'  additions  can  be  of  consider¬ 
able  use. 


Theorem 2.3: For any recursive definition f in F there is a strongly equivalent definition in F which has a uniform inverse.

Proof: If f already has a uniform inverse it serves as its own strongly equivalent definition. If not the following definition serves that purpose. Referring to Def. 2.1 for the definition of f, the following function g defined in terms of the same sets, primitives and predicates is strongly equivalent to f. (p = <p_1, ..., p_n> is a vector which records indices, and d is the initial data structure.)

g(X,p,d) = q(X)                                                        if T(X)
g(X,p,d) = w(g(o_1(X), <1>//p, d), ..., g(o_{m(X)}(X), <m(X)>//p, d))  if ¬T(X)
initially <X,p,d> = <d,λ,d> with d ∈ D.

g is strongly equivalent to f by application of lemma 2.5, with Y of that lemma corresponding to {<p,d> | p a sequence of integers, d ∈ D}, and y_d corresponding to <λ,d> with d ∈ D. Furthermore g has a uniform inverse which is given by the following:

i_O(X,p,d) = p_1

O^-1(X,p,d) = <o_{p_2}( ... (o_{p_{n-1}}(o_{p_n}(d))) ... ), p[p_1 → λ], d>
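The construction used in this proof can be sketched in Python. The sketch rests on assumptions: the function names are hypothetical, the path p is a tuple with the most recent index first, and the underlying o is a Fibonacci-style toy (successors n-1 and n-2) which on its own has no uniform inverse.

```python
def make_augmented(o):
    """Build the successor, index and inverse functions on triples (X, p, d)."""
    def o_aug(i, state):
        x, p, d = state
        return (o(i, x), (i,) + p, d)     # record the index i in front of the path

    def i_O(state):
        return state[1][0]                # the index of the function that produced the state

    def O_inv(state):
        x, p, d = state
        parent = d
        for i in reversed(p[1:]):         # replay the remaining path, oldest index first
            parent = o(i, parent)
        return (parent, p[1:], d)         # drop the newest index from the path

    return o_aug, i_O, O_inv


# Hypothetical toy successor function (an assumption, not from the report).
o = lambda i, n: n - i
o_aug, i_O, O_inv = make_augmented(o)

root = (6, (), 6)
child = o_aug(1, root)
grandchild = o_aug(2, child)
print(grandchild, i_O(grandchild), O_inv(grandchild) == child)
```

The inverse never inspects X itself; it rebuilds the predecessor from d, which is why its cost grows with the length of the path.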


The O^-1 function is quite complex, requiring recreating a sequence of data-structures starting with the initial data structure. In practice one wants to construct a strongly equivalent definition which gives an inverse but entails the creation of an O^-1 function which is simple; simpler, hopefully, than that given in the above theorem. This can often be done, for example under the conditions of the following corollary.

Corollary 2.1: If for a given recursive definition f ∈ F there is no uniform inverse, but each function o_i ∈ O has an inverse o_i^-1, then the definition for g given above with the third component d deleted from its arguments will serve, with the additional benefit that an alternative simpler definition O^-1(X,p) = <o_{p_1}^-1(X), p[p_1 → λ]> can be used.

This corollary can be applied to the 'Tower of Hanoi' definition, 1.3. In that example, o_i ∈ O has an inverse for i = 1 and 3 but does not quite have an inverse when i = 2:

o_1^-1(<<x,y,z>,n>) = <<x,z,y>,n+1>

o_2^-1(<<x,y,z>,n>) = <<x,y,z>,A> where A cannot be determined from <x,y,z>

o_3^-1(<<x,y,z>,n>) = <<z,y,x>,n+1>

So first we slightly modify the definition of f so there will be an inverse for o_2. Lemma 2.5 justifies this simple modification in which a component s is added to store the quantity A above when i = 2, and otherwise to remain equal to 0.

f'(<<x,y,z>,n>,s) = <x,y>                                                                   if n = 1
f'(<<x,y,z>,n>,s) = f'(<<x,z,y>,n-1>,s) // f'(<<x,y,z>,1>,n) // f'(<<z,y,x>,n-1>,s)          if n > 1
initially (<<x,y,z>,n>,s) = (<<1,2,3>,n>,0), n ∈ N

Now f' is equivalent to f in 1.3 and o_i has an inverse for i = 1, 2, or 3. These inverses are:

o_1^-1(<<x,y,z>,n>,s) = <<<x,z,y>,n+1>,0>

o_2^-1(<<x,y,z>,n>,s) = <<<x,y,z>,s>,0>

o_3^-1(<<x,y,z>,n>,s) = <<<z,y,x>,n+1>,0>

Corollary 2.1 now applies to f'. Its application yields g below. (Some unnecessary >'s and <'s have been dropped.)

g(<x,y,z>,n,s,p) = <x,y>                                                                              if n = 1
g(<x,y,z>,n,s,p) = g(<x,z,y>,n-1,s,<1>//p) // g(<x,y,z>,1,n,<2>//p) // g(<z,y,x>,n-1,s,<3>//p)         if n > 1
initially <<x,y,z>,n,s,p> = <<1,2,3>,n,0,λ>

and the uniform inverse is given by

i_O(<x,y,z>,n,s,p) = p_1

O^-1(<x,y,z>,n,s,p) = <o_{p_1}^-1(<x,y,z>,n,s), p[p_1 → λ]>
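The Tower of Hanoi definition with its uniform inverse can be exercised in Python. This is a sketch under assumptions: states are tuples (pegs, n, s, p) with the path p stored newest index first, the function names are hypothetical, and the helper check_inverse is added only to confirm that the inverse undoes each successor function on the states actually reached.

```python
def o(i, state):
    """The three successor functions of g, recording the index i in the path p."""
    (x, y, z), n, s, p = state
    if i == 1:
        return ((x, z, y), n - 1, s, (1,) + p)
    if i == 2:
        return ((x, y, z), 1, n, (2,) + p)   # the old n is saved in s
    return ((z, y, x), n - 1, s, (3,) + p)


def i_O(state):
    return state[3][0]


def O_inv(state):
    """Uniform inverse: undo the successor function named by the first path entry."""
    (x, y, z), n, s, p = state
    i = p[0]
    if i == 1:
        return ((x, z, y), n + 1, 0, p[1:])
    if i == 2:
        return ((x, y, z), s, 0, p[1:])      # n is restored from s
    return ((z, y, x), n + 1, 0, p[1:])


def g(state):
    """Value of the definition: the list of moves (from peg, to peg), w being catenation."""
    (x, y, z), n, s, p = state
    if n == 1:
        return [(x, y)]
    return g(o(1, state)) + g(o(2, state)) + g(o(3, state))


def check_inverse(state):
    """True when O_inv and i_O recover parent and index for every state below this one."""
    if state[1] == 1:
        return True
    return all(O_inv(o(i, state)) == state and i_O(o(i, state)) == i
               and check_inverse(o(i, state)) for i in (1, 2, 3))


start = ((1, 2, 3), 3, 0, ())
print(g(start))
print(check_inverse(start))
```

The inverse can reset s to 0 because every non-terminal state reached from the initial one carries s = 0; only the terminal state produced by the second successor function holds a saved n.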




Implementation  of  f  e  F  with  Associativity  and  Uniform  Inverse 

We will first give a way of implementing any definition f in F which has a uniform inverse and is otherwise unrestricted (see fig. 2.1). Then we will give a way of implementing any f in F which has a uniform inverse and in which w is associative (see fig. 2.2). This is done to contrast the means necessary for implementation in these two cases. In both cases the implementation is described by a flowchart containing, as usual, interconnected assignments and decision statements. In both cases the expressions in the assignment statements and decisions are compositions involving the primitive functions and predicates w, o_i ∈ O, m, q and T and the inverses O^-1, i_O which enter the definition of f ∈ F. In both cases, in addition to the above functions from the definition of f, the repertoire of flowchart expressions is completed by an add 1 function, a push and pop and an = predicate.

In both cases there is a storage cell X which is assumed adequate to hold any member in A_f ∪ Q_f ∪ D_f.

In the case that f has both a uniform inverse and an associative w there is also a storage list V which can hold at most any two members in W_f ∪ w_f. In the case that f only has a uniform inverse the list V is still necessary but it cannot be bounded in size. It may be required to hold any number of members in W_f ∪ w_f. The size actually used will be dependent on the specific function f ∈ F realized as well as the initial data-structure. In this case an auxiliary storage ARG is also used. It holds at most a number of members in W_f ∪ w_f equal to the largest value in the range of m.


Flowcharts 1 and 2 which follow describe a computation for each d ∈ D. It is necessary to give a concrete interpretation of the sense in which a flowchart describes a computation. We imagine a traveler who starts by entering block (0) of the flowchart. The traveler carries out the computation described in that block then, depending on the nature of the block, proceeds to the appropriate next block. The traveler continues following the block instructions and proceeding through the flowchart until FINI is reached, completing the voyage. The value found in V when the traveler has completed the voyage is the value computed by the flowchart.


Flowcharts:  notation  and  assumptions 

In  these  flowcharts  we  will  use  the  following  notation.  General: 

(e  is  an  expression) 

$X \leftarrow e$                        the value of e is assigned to X

$V \xleftarrow{\text{push}} e$          the value of e is pushed onto list V

$X \xleftarrow{\text{pop}} V$           the top member of V is popped and assigned to X

$X \xleftarrow{\text{pop}} V[n]$        the top n members of V are popped and assigned to X

If V is a list $= \langle v_1, \ldots, v_n \rangle$ then w(V) stands for the expression

$w(v_1, v_2, \ldots, v_n)$.




Primitives and their Compositions: (Some of the definitions are extended
to $Q_f$ to make the flowcharts work if the initial data-structure is terminal.)


Flowchart notation     Meaning

FIRST.KID(X)           $o_1(X)$ if $X \in A_f$
#KIDS(X)               $m(X)$ if $X \in A_f$; = 1 if $X \in Q_f$
X = TERMINAL?          $T(X)$ if $X \in A_f \cup Q_f$
PARENT(X)              $o^{-1}(X)$ if $X \in A_f$; = X if $X \in Q_f$
SIB#(X)                $i_o(X)$ if $X \in A_f$; = 1 if $X \in Q_f$
NEXT.SIB(X)            $o_{\mathrm{SIB\#}(X)+1}(\mathrm{PARENT}(X))$ if $X \in A_f$
#SIBS(X)               #KIDS(PARENT(X)) if $X \in A_f$; = 1 if $X \in Q_f$


If w is associative we assume that there is a member $0_w$ in the range of
w such that $w(X, 0_w) = X$ for all X in the range of w. This definition is
used in the second flowchart following.


[Figure 2.1. Flowchart 1: for f ∈ F where f has a uniform inverse.]

[Figure 2.2. Flowchart 2: for f ∈ F where f has a uniform inverse and w is associative.]


When we say a flowchart implements or realizes an f ∈ F we mean
that for each d ∈ D the value of f(d) is equal to the value
computed by the flowchart with the traveler starting at block (0) and d in
the flowchart equal to d in f(d).

We now present proofs that the given flowcharts, figures 2.1 and 2.2, do in
fact implement f ∈ F under the appropriate constraints. The proofs are
very similar, both using induction on the remoteness of the data structures.


Theorem 2.4: If f ∈ F and f has a uniform inverse then it is implemented
by flowchart 1 (figure 2.1).


Proof:  First  we  wish  to  show  that  if  block  (!)  of  flowchart  (1)  is 

entered  by  the  traveler,  with  storage  cell  X  containing 
data  structure  A,  and  the  list  storage  facility  V  containing 
the  sequence  of  elements  a,  then  the  traveler  will  eventually 
arrive  at  block  ©  with  the  value  of  f(A)  in  X,  and  with 
o^<f(A)>  in  V. 

This  is  done  by  induction  on  the  remoteness  of  A.  It  is 
obvious  that  if  the  remoteness  of  A  is  1  then  with  traveler 
Starting  by  entering  block  ®  of  the  flowchart  with  A  in  X 
and  o  in  V,  the  traveler  will  execute  blocks  ®,  ®,  ® 
then  (5)  because  o^(A)  must  be  of  remoteness  0,  then  ©, 
and  if  1  ®  SIB# (A)  *  #SIBS(A)  then  ©  will  be  next.  After 
©  the  cycle  ®  ©  ®  ©  0  ©  will  be  repeated.  Alto¬ 
gether,  the  cycle  repeats  #SIBS(A)=p  times.  Then  the 
traveler  will  proceed  through  ®  ©0  ©0  an(*  this  ti"*® 


continue  with  ©(7)©  to  (§)•  Tracing  the  change  in  the  content 
Of  X  and  V  during  this  journey ;X  contains  o.+1(A)  after  the  ith 
cycle  of  ©(f)©©©©;  when  the  traveler  reaches  (5) 

X  contains  Op(A),  but  after  (8)  it  contains  A  on  arrival  at 
©.  Simply  tracing  the  blocks  on  travelers  path  shows  that 
V  will  contain  a//<f(A)>  when  traveler  enters  ©.  •  Assume 
that  if  A  is  of  remoteness  <  n;  A  is  in  X,  a  in  V,  and  the 
traveler  starts  by  entering  ©,  the  traveler  will  eventually 
arrive  at  block  (?)  with  f(A)  in  X  and  a//<f(A)>  in  V. 

Consider  then  that  A  is  of  remoteness  n,  A  is  in  X,  a  in 
V,  and  the  traveler  enters  .  Since  X  is  not  TERMINAL,  the 

traveler  goes  to  (5).  As  a  result,  the  traveler  enters  (l) 
again  with  (AJ  in  X,  and  o  still  in  V.  Thus  by  the 
inductive  hypothesis,  the  traveler  will  eventually  enter 
(9)  with  OjCA)  in  X  and  a>/<f(A)>  in  V.  Next  since  o^A) 
cannot  »  d  or  the  uniform  inverse  would  not  exist,  the 
traveler  will  pass  through  (?^  back  to-  (4).  Assumming  that 
1  »  SIB# (X)  *  #SIBS(X)  where  X  *  OjCA),  then  the  traveler 
passes  through  ©  to  ©  where  X  is  made  ■  to  the 
NEXT. SI B (o j (A) )  or  o2fA).  When  now  the  traveler  re-enters 
Q  then  X  is  OjCA)  and  V  is  a//<f (o^(A))>.  Again  the 
traveler  passes  through  ©  and  by  the  inductive  hypotheses 
eventually  to  (?)  with  X  containing  OjCA)  and  V  now  contain¬ 
ing  a^kfCOj (A)) ,f(o2 (A) )>.  This  process  continues  until 
if  p«#SIBS(A)  traveler  arrives  at  ©  with  X  containing 
©  (A)  and  V  containing: 


$\alpha /\!/ \langle f(o_1(A)), f(o_2(A)), \ldots, f(o_{p-1}(A)) \rangle$

Then  by  the  inductive  hypothesis  the  traveler  eventually 
arrives  at  ©  with  X  containing  op(A)  and  V  containing: 

$\alpha /\!/ \langle f(o_1(A)), f(o_2(A)), \ldots, f(o_p(A)) \rangle$

The  traveler  then  passes  through  '©  entering  ©,  and  be¬ 
cause  SIB#(op(A))  =  p  =  #SIBS(op(A))  the  traveler  will  then 
enter©,  then  (?)  as  a  result  of  which  V  now  contains: 

$\alpha /\!/ \langle w(f(o_1(A)), \ldots, f(o_p(A))) \rangle = \alpha /\!/ \langle f(A) \rangle$

Then  the  traveler  goes  through  (8)  finally  arriving  at  (5) 
with  V  still  containing  a/kf(A)> 

PARENT (op (A) )  =  A. 

Thus the first point to be made is proven.

Now let the traveler start by entering block (0), thus
setting V to $\Lambda$ and X to d ∈ D. The traveler then enters (1),
and by the first point made above, the traveler will eventually
arrive at (9) with $V = \Lambda /\!/ \langle f(d) \rangle = f(d)$ and X = d. Then at
(9) the test will succeed, leaving the traveler at FINI with
V = f(d).

The proof covers the case that d is of remoteness ≥ 1.

For remoteness of d = 0, a direct trace of the flowchart will
verify its adequacy.

# 

Theorem 2.5: If f ∈ F and f has a uniform inverse and w in the definition
of f is associative, then f is implemented by flowchart 2 (figure 2.2).




Proof: The proof is very similar to that of Theorem 2.4. The
difference is in the value that will be in V when the
traveler reaches (9).

First we need to show that if A is of remoteness n and block (1)
of flowchart 2 is entered with A in X and B in V, then
eventually the traveler arrives at (9) with A still in
X and with V containing

$w[\ldots w[w[B, f(o_1(A))], f(o_2(A))], \ldots, f(o_{m(A)}(A))]$

which, by associativity,

$= w[B, w[f(o_1(A)), \ldots, f(o_{m(A)}(A))]] = w[B, f(A)]$

Again we use induction. The case when A is of remoteness 1
is easily verified by tracing the flowchart through the
.  sequence  of  blocks  <®®  ®  (5)  (6)  (4)  ©>ra(A)-l  times  and 
then  through  ®®®©©@®®. 

Assume  First  is  correct  if  the  remoteness  of  A  is  <  n.  Now 
let  A  be  of  remoteness  n;  X  is  A.V  is  B  and  the  traveler  is 
at  ®.  The  traveler  goes  to  ®  where  X  becomes  FIRST. KID (X)  = 
Oj(A)  and  the  traveler  returns  to  ©•  Since  o^(A)  is  of 
remoteness  <  n,  the  inductive  hypothesis  applies.  Thus  the 
traveler  arrives  at  ©  with  X  being  OjCA)  and  V  *  wfB.fCOjCX))]. 
o^(A)  cannot  be  initial  because  of  the  inverse  so  the  traveler 
goes  next  to  ©.  If  we  assume  now  that 

1  »  SIBf(X)  *  #SIBS(X)  where  X  =  o^(A),  the  traveler  wili  pass 
through  ©  and  ©  updating  X  to  contain  OjCA)  and  then  enter 
By  inductive  hypothesis  again  the  traveler  will  eventually 
arrive  at  ©  with  V  containing: 

$w[w[B, f(o_1(A))], f(o_2(A))]$

and  X  containing  OjCA).  Assuming  without  loss  in  generality 
that  p  ■  »(A).  the  traveler  will  eventually  arrivo  at  ® 




after p repeats of the journey from (1) to (5) with V containing:

$w[\ldots w[w[B, f(o_1(A))], f(o_2(A))], \ldots, f(o_p(A))] = w[B, w[f(o_1(A)), \ldots, f(o_p(A))]]$

by associativity, and $= w[B, f(A)]$ by definition of f(A).

X  contains  op(A)  at  this  time.  So  the  traveler  goes  to  @ 

where  the  decision  is  yes;  ^  is  next  with  X  becoming 

its PARENT, i.e. $\mathrm{PARENT}(o_p(A)) = A$. Thus the traveler
arrives at (9) again with X containing A, V still containing
$w[B, f(A)]$. Thus the first result is proven. Now let the
traveler start by entering (0), thus setting X to d and V
to $B = 0_w$. Next the traveler enters (1) with these values
in X and V, and so by the first result the traveler will
eventually arrive at (9) with X containing d and V containing
$w[0_w, f(d)] = f(d)$ by definition of $0_w$.

As before, the proof is for d ∈ D having remoteness ≥ 1 and
is verified to include remoteness 0 by tracing flowchart 2
explicitly for this case.
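
The difference between the two theorems can be illustrated with a short sketch. It is not a transcription of flowcharts 1 and 2 (the figures are not reproduced here); it only assumes that a uniform inverse can be modeled by stored parent pointers and sibling numbers, and that the `Node` class, `link`, and both function names are hypothetical. The first routine keeps a list V of partial results, as Theorem 2.4 requires; the second keeps a single accumulator, which is enough when w is associative, has an identity, and returns its argument when applied to one value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    value: int = 0                                    # used only at terminal nodes
    kids: List["Node"] = field(default_factory=list)  # o_1(X), ..., o_m(X)
    parent: Optional["Node"] = None                   # PARENT(X)
    sib_no: int = 1                                   # SIB#(X), counted from 1


def link(node, parent=None, sib_no=1):
    """Fill in parent pointers and sibling numbers (setup only)."""
    node.parent, node.sib_no = parent, sib_no
    for i, kid in enumerate(node.kids, start=1):
        link(kid, node, i)
    return node


def eval_with_list(d, w, t):
    """Evaluate f(d) with an explicit list V; w takes a list of values."""
    V, X = [], d
    while True:
        while X.kids:                     # X is not TERMINAL
            X = X.kids[0]                 # FIRST.KID(X)
        V.append(t(X))                    # push the terminal value
        # Climb while X is the last sibling: SIB#(X) = #SIBS(X).
        while X is not d and X.sib_no == len(X.parent.kids):
            n = len(X.parent.kids)
            X = X.parent                  # PARENT(X)
            args = V[-n:]                 # pop the top n members of V
            del V[-n:]
            V.append(w(args))             # push w of the children's values
        if X is d:
            return V.pop()
        X = X.parent.kids[X.sib_no]       # NEXT.SIB(X); sib_no is 1-based


def eval_with_accumulator(d, w2, t, zero):
    """Same traversal with one accumulator; w2 is binary and associative,
    zero is its identity (the 0_w of the text)."""
    V, X = zero, d
    while True:
        while X.kids:
            X = X.kids[0]
        V = w2(V, t(X))                   # fold the terminal value into V
        while X is not d and X.sib_no == len(X.parent.kids):
            X = X.parent
        if X is d:
            return V
        X = X.parent.kids[X.sib_no]


tree = link(Node(kids=[Node(1), Node(kids=[Node(2), Node(3)])]))
print(eval_with_list(tree, sum, lambda x: x.value))
print(eval_with_accumulator(tree, lambda a, b: a + b, lambda x: x.value, 0))
```

Neither routine uses a return-address stack: the parent pointer and sibling number play the role of the inverse, which is the point of requiring it to be uniform.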

The necessity for a 'uniform inverse' as opposed to a simple inverse
in developing these theorems results from the fact that in the standard
form of recursive definition considered here the number of appearances of
the defined function symbol f is determined (= m(X)) by X, the argument of f.
This dependence was incorporated so that many common problems could be
naturally expressed in that form.

We have not discussed the higher order recursive definitions having
nesting on the right, largely because in our experience such definitions
rarely occurred in practice. Such definitions are considered in [6]. The
techniques given in [6] in combination with those here can be used to extend
the above results to higher order recursive definitions not covered in [6].


In the Introduction the existence of a time-efficient implementation
of a function f ∈ F was traced to the fact that in the standard
evaluation of f, amongst the many sub-computations necessary, there are
pairs which are virtually identical. In the implementation it then becomes
possible to use the remembered result of computing one member of
such a pair in computing the second member. Thus the time cost of recomputation
is minimized. Examples illustrating this general assertion are given below.
Consider a function f ∈ F for which the following properties hold.

(1) There is a relation called dominance between some pairs of members
of $W_f$ (= the range of w in the definition of f) such that:

(2) Whenever two members of $W_f$ with one dominating the other both appear
as arguments of a w function in a given order(s), then the dominated
argument may be removed (the non-dominated one possibly requiring concurrent
simple alteration) and the w function will still give the same result.

Properties (1) and (2) alone are sometimes sufficient to
allow significant time saving, as when w is a logical 'and' function
with the arguments 0 or 1. If any argument of w is known to be 0 then
the other arguments need not be computed. However, the existence of (1)
and (2) is not always sufficient to guarantee a time saving. But
if the following property also holds, time saving can be guaranteed:

(3) For a substantial set of pairs x and y such that $x = o_{\xi}(d)$ and $y = o_{\zeta}(d)$, where
$\xi$ and $\zeta$ are sequences applicable to the same initial data structure d, it is
simple to determine whether f(x) dominates f(y). (A pair of data structures
x and y ∈ $A_f$ for which such a determination is possible are called
comparable.)




Consider the following example of the existence of all three of the above
properties. If w is a minimum, and $W_f$ is the set of positive integers, then
we can define x dominates y to mean that x is less than y, and (1) and (2)
will be satisfied. This in itself is not enough to guarantee any time saving.
Let $\alpha$ and $\beta$ be partial paths in an integer weighted digraph G and let
$f(\alpha, c(\alpha))$ / $f(\beta, c(\beta))$ be the cost of a path from node A to node B in G starting
with the partial path $\alpha$ / $\beta$ whose cost is itself $c(\alpha)$ / $c(\beta)$. Then with the
current definition of dominance one can determine whether $f(\alpha, c(\alpha))$ dominates
$f(\beta, c(\beta))$ whenever $\alpha$ and $\beta$ end on the same node. If they do, $f(\alpha, c(\alpha))$
dominates $f(\beta, c(\beta))$ if $c(\alpha) < c(\beta)$; otherwise $f(\beta, c(\beta))$ dominates $f(\alpha, c(\alpha))$.
Thus (3) is satisfied and it is easy to see that dominated functions need
not be computed.
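
The saving can be made concrete with a small sketch. It assumes a graph given as a dictionary from a node to a list of `(neighbor, weight)` pairs with positive weights; the function names and the example graph are illustrative, not taken from the report. The first routine evaluates the minimum-cost recursion over every loopless path; the second remembers the least cost of any partial path seen so far ending on each node and abandons a partial path that is dominated in the sense above (ties are treated as dominated, which is an assumption of this sketch).

```python
import math


def min_cost_plain(graph, a, b):
    """Minimum cost from a to b by enumerating every loopless path."""
    calls = 0

    def f(path, cost):
        nonlocal calls
        calls += 1
        c = path[-1]
        if c == b:
            return cost
        best = math.inf
        for nxt, w in graph.get(c, []):
            if nxt not in path:                  # keep the path loopless
                best = min(best, f(path + [nxt], cost + w))
        return best

    return f([a], 0), calls


def min_cost_pruned(graph, a, b):
    """Same minimum, but dominated partial paths are not expanded."""
    expansions = 0
    least = {}   # least cost of any partial path seen so far ending on a node

    def f(node, cost):
        nonlocal expansions
        if cost >= least.get(node, math.inf):
            return math.inf                      # dominated: drop this argument of min
        least[node] = cost
        expansions += 1
        if node == b:
            return cost
        return min((f(nxt, cost + w) for nxt, w in graph.get(node, [])),
                   default=math.inf)

    return f(a, 0), expansions


example = {0: [(1, 1), (2, 2)], 1: [(3, 1)], 2: [(3, 1)],
           3: [(4, 1), (5, 1)], 4: [(6, 1)], 5: [(6, 2)]}
print(min_cost_plain(example, 0, 6))
print(min_cost_pruned(example, 0, 6))
```

Both routines return the same minimum; the second one simply performs fewer expansions whenever two partial paths meet on a common node.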

In fact whenever the three properties above hold, time savings are
possible. To see this requires a brief review of implementation techniques
for f ∈ F. The standard depth first or breadth first implementation of a
recursive definition f ∈ F is a simulation of the substitution process
described in Section 2. Initially the substitution starts with the 'evaluation
form' f(d), d ∈ $D_f$. If T(d) is not true this is replaced by the 'evaluation
form' $w(f(o_1(d)), \ldots, f(o_{m(d)}(d)))$. Then a substitution
for $f(o_j(d))$, for some $1 \le j \le m(d)$, is made to get a next 'evaluation form'.
The process continues with subsequent substitutions for occurrences of the
form $f(\alpha)$. Now if the above three conditions hold one can include in the
evaluation an examination to determine whether two appearances of f, say
$f(\alpha)$ and $f(\beta)$ appearing in a subexpression like $w(f(\alpha), \ldots, f(\beta), \ldots)$
of an evaluation form, are comparable. If they are, and if $f(\alpha)$ dominates
$f(\beta)$, then $f(\beta)$ can be eliminated from the subexpression. An entire course
of substitutions is thus eliminated.

When the three conditions above hold, it is of advantage then to incorporate
in the implementation a means for comparing pairs of data-structures
$\alpha$ and $\beta$ in $A_f$. The details on how this is done will depend on the details
of the definition, but two broad classes can be distinguished. For a definition
f ∈ F in which comparable data-structures arise, the pattern and frequency
of their occurrences may be highly dependent on the initial data-structure,
or alternately their occurrences may follow a fixed, predictable pattern
largely independent of the input. In the input dependent case a facility
for testing for comparability can be incorporated. It must be able to handle
comparisons in a general way. Many partial results will have to be saved
for comparison, even though the benefit derived from the comparison may be
small. In the patterned or systematic case the implementing algorithm can
often be tailored to take advantage of this fixed pattern, avoiding the need
for a general comparison facility.

In this section, a significant subclass of functions in F which have
a patterned structure of comparable data-structures will be studied. This
subclass is called the 'explicit history' class. Corresponding to each
member of this class is a set of equations whose solution is equivalent to
the evaluation of the corresponding function.* If this set of equations
has the property of 'open-loop consistency' it can be solved by a process
similar to Gaussian Elimination. This in turn will immediately provide a
relatively efficient algorithm for implementation of such functions in F.

* Problems whose solution can be obtained by effectively solving a set of
equations with 'linear' like properties form a significant class. Such
a class is carefully considered in [1]. Here we are interested in how
recursive definition formulations to these and even some 'non-linear'
problems are related to their set of equations formulation.

The definition of 'open-loop consistency', a property of equation sets,
will be developed first, then that of the explicit history function, and then
in Theorem 3.1 their relation will be established.

Throughout the subsequent development we will make extensive use of
notation similar to that introduced in section 2 for indicating replacements
of components of a vector. Here we will extend that notation to describe
replacement of a component of any expression. So if e is an expression,
$X_j$ a variable which may appear in e, and $e_j$ another expression, then
$e[X_j \leftarrow e_j]$ means the result of replacing each occurrence of $X_j$ in the
expression e by the expression $e_j$. The notation can be further extended to
cover sets of such substitutions; for example, $e[X_k \leftarrow e_k : k \in N]$ is the
result of substituting the expression $e_1$ for all occurrences of $X_1$, $e_2$
for all occurrences of $X_2$, ..., and $e_n$ for all occurrences of $X_n$ in e.

Also the notation allows composition; so $e[X_2 \leftarrow e_2][X_1 \leftarrow 0]$ is the result
of first replacing each $X_2$ in e by $e_2$ and then every $X_1$ in the result by 0.
Note that the expression $e[X_2 \leftarrow e_2[X_1 \leftarrow 0]][X_1 \leftarrow 0]$ as well as the expression
$e[X_1 \leftarrow 0][X_2 \leftarrow e_2[X_1 \leftarrow 0]]$ gives the same result as that of the notation in
the previous sentence. Such reorderings yielding the same result will be
used in the proof of Theorem 3.1.
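
The reordering claim can be checked mechanically on a small case. The sketch below is only an illustration under assumed conventions: an expression is a nested tuple whose first element names a function, a variable is a string such as `"X1"`, and `subst` is a hypothetical helper that performs one simultaneous substitution.

```python
def subst(e, repl):
    """Return e with every variable named in repl replaced simultaneously."""
    if isinstance(e, str):                       # a variable
        return repl.get(e, e)
    if isinstance(e, tuple):                     # (function name, arg, arg, ...)
        return (e[0],) + tuple(subst(arg, repl) for arg in e[1:])
    return e                                     # a constant such as 0


e = ("w1", "X1", "X2")
e2 = ("w2", "X1", "X3")

# e[X2 <- e2][X1 <- 0]
first = subst(subst(e, {"X2": e2}), {"X1": 0})
# e[X2 <- e2[X1 <- 0]][X1 <- 0]
second = subst(subst(e, {"X2": subst(e2, {"X1": 0})}), {"X1": 0})
# e[X1 <- 0][X2 <- e2[X1 <- 0]]
third = subst(subst(e, {"X1": 0}), {"X2": subst(e2, {"X1": 0})})

print(first == second == third)
```

All three orders produce the same tuple because, in each of them, every occurrence of `X1` ends up replaced by 0 and every occurrence of `X2` by the zeroed copy of `e2`.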


Open-Loop Consistency and Explicit History Definitions


Equation  Sets 


$W_n$ is a set of functions. If $w \in W_n$ then w is defined on 0 arguments,
1 argument, ..., n arguments. Each argument is drawn from a set S.
The range of each w is also S. S contains a 0 element with the property that

$w(X_1, \ldots, X_n) = w(X_1, \ldots, X_{c-1}, X_{c+1}, \ldots, X_n)$ if $X_c = 0$.

$E_n(W_n)$ is a set of equations involving n functions from $W_n = \{w_1, \ldots, w_n\}$
and the variables $X_1, \ldots, X_n$.

$E_n(W_n)$ contains a subset called the basis subset.

The basis of $E_n(W_n) = \{X_j = w_j(X_{j_1}, \ldots, X_{j_k}) \mid j \in N,\ j_i \in N,\ j_i > j_k \text{ if } i > k\}$

In addition, $E_n(W_n)$ satisfies the closure conditions that:

C1: If $X_j = e$ is in $E_n(W_n)$ then so is $X_j = e[X_k \leftarrow 0]$ for any $X_k$ in e.

C2: If for any given j and k ∈ N, $X_j = e_1$ and $X_k = e_2$ are each members
of $E_n$, then so is $X_j = e_1[X_k \leftarrow e_2]$.

$E_n(W_n)$ consists only of those equations in the basis together with those
constructible from C1 and C2.

From here on we use $E_n$ to stand for $E_n(W_n)$.


Open-Loop  Consistent  Equation  Sets 

Let $Q = \{X_j = e_j \mid j \in N\}$ be a set of equations in $E_n$.

Let $Q_c = \{X_j = e_j \mid 1 \le j \le c-1\} \cup \{X_c = e_c[X_c \leftarrow 0]\} \cup \{X_j = e_j \mid c+1 \le j \le n\}$

If every solution to Q is also a solution to $Q_c$, then Q in $E_n$
is open-loop consistent in $X_c$. If Q of $E_n$ is open-loop consistent in all
variables $X_c$, c ∈ N, then Q is open-loop consistent in $E_n$. If every subset
in $E_n$ of the form of Q is open-loop consistent, then $E_n$ is open-loop consistent.


The significance of open-loop consistency follows from the following.

On the one hand, if the basis set of $E_n$ is:

$\{X_i = e_i \mid i \in N\}$

and $E_n$ is open-loop consistent, then a process analogous to Gaussian Elimination
will be adequate to solve the basis equations for $\{X_i \mid i \in N\}$. This
process is based on the alternate application of two operations applied to
a set of equations which is initially the basis of $E_n$. The first operation
is used to remove the recursive appearance, in an equation of the form
$X_j = e_j$, of any occurrence of $X_j$ in $e_j$. This is made possible, if the set of
equations is open-loop consistent, by setting all occurrences of $X_j$ in $e_j$
to 0. (Open-loop consistency is a specialization of a considerably more
general property which would allow removal of such recursive appearances,
which we hope to develop at a future time.) This first operation is only
necessary if there is such a recursive appearance to be removed in one or
more equations of the set. The second operation uses an equation of the
form $X_j = e_j$, with $X_j$ appearances removed from $e_j$, to
substitute for all appearances of $X_j$ on the right of other equations in
the set, thus eliminating all occurrences of $X_j$ on the right of all equations
in the set. The two operations can be repeated n times as needed to produce
a solution to the basis set from $E_n$. Note that every equation in the sets
that result from different steps in this process is in $E_n$, so that the
open-loop consistency condition is always applicable. In any case this process
provides a relatively efficient means of solving such a set of equations.

On the other hand the evaluation of an 'explicit history' recursive
definition will be shown to be equivalent to the solution of a corresponding
set of equations forming the basis of a set $E_n$ if $E_n$ is open-loop
consistent. Thus the relatively efficient solution will be applicable to
'explicit history' definitions having this property.
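
A minimal sketch of the two alternating operations, specialized to equations of the form $X_j = \min(a + X_k, \ldots)$ with positive a. The representation is an assumption made for illustration: each right side is a dictionary from a variable index to the constant added to it, the key `None` holds a constant term, and a missing term plays the role of the 0 element (reported as `math.inf` when nothing is left). The function name is hypothetical.

```python
import math


def solve_min_plus(equations):
    """Solve {j: {k: a, ...}} meaning X_j = min over k of (a + X_k)."""
    eqs = {j: dict(rhs) for j, rhs in equations.items()}
    for j in sorted(eqs):
        # Operation 1: remove the recursive appearance of X_j in e_j.
        eqs[j].pop(j, None)
        # Operation 2: substitute e_j for X_j on the right of the others.
        for i, rhs in eqs.items():
            if i != j and j in rhs:
                a = rhs.pop(j)
                for k, b in eqs[j].items():
                    rhs[k] = min(rhs.get(k, math.inf), a + b)
    # Every right side is now a constant (or empty).
    return {j: rhs.get(None, math.inf) for j, rhs in eqs.items()}


# X1 = min(1 + X2, 7 + X3), X2 = min(1 + X1, 5 + X3), X3 = 0
system = {1: {2: 1, 3: 7}, 2: {1: 1, 3: 5}, 3: {None: 0}}
print(solve_min_plus(system))
```

In this example the substitution of $X_1$ into the second equation creates the recursive term $2 + X_2$, which operation 1 then removes; that removal is exactly the step licensed by open-loop consistency.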


Explicit  History  Recursive  Definitions 


The 'explicit history' recursive definition will initially be given in
a form which allows the simple description of the corresponding set of equations.
Later we will show other equivalent forms for such definitions.
Recalling that N = {1, 2, ..., n}, let H be a vector all of whose
components are in N, and no two of which are the same. H is the history
vector. H//c will be used as a shorthand for H//<c> = the concatenation
of <c> with vector H. If V is a vector, {V} is the set of components of V.


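$w_c \in W_n$, the set of functions defined earlier; $c_i \in N$; $c_i > c_k$ if $i > k$.
An 'explicit history' recursive definition has the form:

def 3.1

$f_c^{H} = 0$ if $c \in \{H\}$
$f_c^{H} = w_c(f_{c_1}^{H/\!/c}, \ldots, f_{c_m}^{H/\!/c})$ if $c \in N - \{H\}$

initially $H = \Lambda$, c ∈ N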


If f is an explicit history recursive definition then the corresponding
set of equations, designated $E_n(f)$, is built on the following basis:

def 3.1.1   $\{X_c = w_c(X_{c_1}, \ldots, X_{c_m}) = e_c \mid c \in N\}$


Relation of Open-Loop Consistent Equation Sets and Explicit History Definitions

Theorem 3.1: If f is an explicit history recursive definition and $E_n(f)$,
its corresponding equation set, is open-loop consistent, then
the set of values $\{X^*_c \mid c \in N\}$ as determined by evaluating f
will satisfy the basis of $E_n(f)$ with $X_c = X^*_c$ for all c ∈ N.

Proof: A partly inductive argument will be used.

First we define two kinds of expressions, $Z_j^{\alpha}$ and $Y_j^{\alpha}$.

If $N - \{\alpha\} = \{j\}$:

$d_1$)   $Z_j^{\alpha} = e_j$ = the right side of the equation $X_j = e_j$ in $E_n$

If $j \in \alpha$:

$d_1'$)  $Z_j^{\alpha} = 0$

If $j \in N - \{\alpha\}$:

$d_1''$) $Z_j^{\alpha}$ = an expression giving $X_j$ as a composition of functions
in $W_n$ in which only the variables $X_k$ with $k \in \{\alpha\}$ appear

$d_2$)   $Y_j^{\alpha} = Z_j^{\alpha}[X_k \leftarrow 0 : k \in \{\alpha\}]$

$Y_j^{\alpha}$ then gives $X_j$ as an expression involving no variables and thus
as a composition of functions from $W_n$ with no arguments. A function $w_j$ with
no arguments must be defined as a constant, and so the expression $Y_j^{\alpha}$ must be
evaluated as a constant. By definition of $Y_j^{\alpha}$:

y*  ■  2* 

j  J 

Note that by definition, when $N - \{\alpha\} = \{j\}$ the equation $X_j = Z_j^{\alpha}$ is
in $E_n$; thus, since $E_n$ is open-loop consistent, so is $X_j = Y_j^{\alpha}$ (by $d_2$ above).

(H) Assume that for $\alpha$ = any sequence of k < n integers taken from
N with no two equal, and $c \in N - \{\alpha\}$, that:

$X_j = Z_j^{\alpha /\!/ \langle c \rangle}, \quad j \in N - \{\alpha /\!/ \langle c \rangle\}$

are each true equations in $E_n$.


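Then for c ∈ N:

$X_c = e_c$ is in $E_n$ by def 3.1.1.

$X_c = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle} : j \in N - \{\alpha /\!/ \langle c \rangle\}]$ is in $E_n$ by definition of
$E_n$ and (H).

$X_c = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle} : j \in N - \{\alpha /\!/ \langle c \rangle\}][X_c \leftarrow 0]$ is in $E_n$ because $E_n$ is
open-loop consistent and $X_j = Z_j^{\alpha /\!/ \langle c \rangle}$ is in $E_n$ by (H).

$X_c = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle}[X_c \leftarrow 0] : j \in N - \{\alpha /\!/ \langle c \rangle\}][X_c \leftarrow 0]$ is in $E_n$ by
reordering and substitution.

$X_c = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle}[X_c \leftarrow 0] : j \in N - \{\alpha\}]$ is in $E_n$ by the definition
by which $Z_j^{\alpha /\!/ \langle c \rangle} = 0$ when $j \in \{\alpha /\!/ \langle c \rangle\}$.

So if $Z_c^{\alpha}$, $c \in N - \{\alpha\}$, is defined as:

$Z_c^{\alpha} = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle}[X_c \leftarrow 0] : j \in N - \{\alpha\}][X_c \leftarrow 0]$

then $X_c = Z_c^{\alpha}$ is true and is in $E_n$ for $c \in N - \{\alpha\}$, and by definition
of $Y_c^{\alpha}$ and $d_2$), $X_c = Y_c^{\alpha}$ for $c \in N - \{\alpha\}$. In fact we have that:

$Y_c^{\alpha} = Z_c^{\alpha}[X_k \leftarrow 0 : k \in \{\alpha\}]$   by $d_2$

$Y_c^{\alpha} = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle}[X_c \leftarrow 0] : j \in N - \{\alpha\}][X_k \leftarrow 0 : k \in \{\alpha\}]$   by reordering

$Y_c^{\alpha} = e_c[X_j \leftarrow Z_j^{\alpha /\!/ \langle c \rangle}[X_k \leftarrow 0 : k \in \{\alpha /\!/ \langle c \rangle\}] : j \in N - \{\alpha\}]$

(1)  $Y_c^{\alpha} = e_c[X_j \leftarrow Y_j^{\alpha /\!/ \langle c \rangle} : j \in N - \{\alpha\}]$

or, since by $d_1'$:

$Y_j^{\alpha} = 0$ if $j \in \alpha$

(2)  $Y_c^{\alpha} = e_c[X_j \leftarrow Y_j^{\alpha /\!/ \langle c \rangle} : j \in N]$ if $c \in N - \alpha$

These two relations give the recursive definition we seek.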


The recursive definition, def 3.1, is not, as written, in the form
required for a member of F. However, by simple transformations a definition
which is a member of F can be generated.

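Let f(H,c) stand for the function of def 3.1 with history H and subscript c,
and make the subscript of w an argument of a modified w function.

def 3.1a

f(H,c) = 0                                                    if $c \in \{H\}$
$f(H,c) = w(c, f(H/\!/c, c_1), \ldots, f(H/\!/c, c_m))$       if $c \in N - \{H\}$

initially $H = \Lambda$, c ∈ N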

Still one more small change is needed to create a definition in F. The
first argument c of w is not in the required form. It is put in the
correct form by adding another argument, s, which is either 0 or 1.

def 3.1b

f(H,c,0) = 0                                                              if $c \in \{H\}$
f(H,c,s) = c                                                              if s = 1
$f(H,c,s) = w(f(H,c,1), f(H/\!/c, c_1, 0), \ldots, f(H/\!/c, c_m, 0))$    if $c \in N - \{H\}$

initially $H = \Lambda$, c ∈ N, s = 0


Theorem 3.1 is the main result of this section. Its application will
be illustrated by an example. In order, however, to make this result generally
applicable, it would be necessary to further elucidate two questions concerning
the set of equations which correspond to a given 'explicit history' definition.
The first of these is: do sets of equations which are open-loop consistent
arise with significant frequency in practice? The second question arises
after one has an open-loop consistent set, at which point one would like to
know if the solution is unique. This would guarantee that evaluation of
the recursive definition would give the same result as would any procedure
for solving the corresponding equation set. If the solution is not unique
it would be well to know to which solutions of the equation set, obtained
by what procedure, the evaluation of the corresponding recursive definition
(which is always unique) is equivalent. These two questions are not considered
further here. Now, however, we note some very special
cases in which the answers to these questions are evident.

We say $E_n$ has no loop if the equations in the basis of $E_n$:

$X_j = e_j$

have the property that $e_j$ contains no occurrence of any variable $X_k$, $k \le j$.

If this is the case then it is easy to see that no equation in $E_n$ can be
recursive, i.e. have the variable appearing on the left of the equation
also occur on the right. From this it follows directly that $E_n$ is open-loop
consistent. Furthermore, it is easy to show that in this case the
solutions are unique.

Lemma 3.1: If the basis of $E_n$ has no loops then $E_n$ is open-loop consistent
and its solution is unique.

Examples of Applications of Theorem 3.1

The maximum and minimum path problems, each defined on a directed
graph, will be used to illustrate the application of Theorem 3.1.

The following graph definition will be used:

G is a directed graph with a set of nodes N = {1, 2, ..., n}.

If c is a node in G then:

$c_i$ is the i-th neighbor of c (the i-th node reachable from c by
traversing a single directed branch)
$c_m$ is the last neighbor of c

$W(c_i)$ is the weight of the branch from c to $c_i$. It is always > 0.

If $X_i$ is a number then:

$\max(X_1, \ldots, X_n)$ = the maximum of $X_i$, i ∈ N
$\min(X_1, \ldots, X_n)$ = the minimum of $X_i$, i ∈ N

A path $h = \langle h_1, \ldots, h_p \rangle$ in G is a sequence of nodes in G such
that there is a branch $\langle h_i, h_{i+1} \rangle$ in G for each $p-1 \ge i \ge 1$. The
weight of path h is the sum of the weights of the branches
$\langle h_i, h_{i+1} \rangle$, $p-1 \ge i \ge 1$.

Interpret MAW(h,c)* to be the weight of the maximum weight loopless path starting
with the path h//<c> and ending on node t. Since continuation of such a
path must be by passage to some neighbor, $c_i$, of c which cannot be in h//<c>
if there are to be no loops, we have:

Ex. 3.1

MAW(h,c) = 0                                                                          if $c \in \{h\}$ or c = t
$MAW(h,c) = \max(W(c_1) + MAW(h/\!/c, c_1), \ldots, W(c_m) + MAW(h/\!/c, c_m))$       if $c \in N - \{h\}$

initially $h = \Lambda$, c ∈ N = the first node in path

and analogously for the minimum path: if MIW(h,c) is the minimum
weight of all paths starting with path h//<c> and ending on node t, then its
recursive definition can be given by the equations:

* Strictly this is $MAW_t(h,c)$ since the paths end on t.




Ex. 3.2

MIW(h,c) = 0                                                                          if $c \in \{h\}$ or c = t
$MIW(h,c) = \min(W(c_1) + MIW(h/\!/c, c_1), \ldots, W(c_m) + MIW(h/\!/c, c_m))$       if $c \in N - \{h\}$

initially $h = \Lambda$, c ∈ N
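
Both examples can be evaluated directly as written. The sketch below is illustrative only and makes one assumption loudly: the two uses of 0 in the definitions are kept apart, with the numeric 0 returned on reaching t and `None` standing for the 0 element of S (an argument that w simply drops) when c is already in the history. The function name, the `pick` parameter, and the example graph are hypothetical.

```python
def explicit_history(graph, h, c, t, pick):
    """Evaluate MAW (pick=max) or MIW (pick=min) for history h and node c.

    graph maps a node to a list of (neighbor, weight) pairs.
    """
    if c == t:
        return 0                      # the path has reached its end node
    if c in h:
        return None                   # 0 element: this continuation is dropped
    values = []
    for nxt, weight in graph.get(c, []):
        rest = explicit_history(graph, h + (c,), nxt, t, pick)
        if rest is not None:          # w ignores arguments equal to the 0 element
            values.append(weight + rest)
    return pick(values) if values else None


graph = {1: [(2, 1), (3, 7)], 2: [(1, 1), (3, 5)], 3: []}
print(explicit_history(graph, (), 1, 3, min))   # minimum weight path 1 -> 3
print(explicit_history(graph, (), 1, 3, max))   # maximum weight loopless path 1 -> 3
```

The history tuple h grows by one node per level, so the recursion always terminates, but the same node may be evaluated many times under different histories; that repeated work is what the equation-set formulation avoids.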

With $w(c, X_1, \ldots, X_m) = \max/\min(W(c_1) + X_1, \ldots, W(c_m) + X_m)$, both examples
are in the form of definition 3.1a and thus are equivalent to the recursive
definition in theorem 3.1. Therefore, in both cases we may speak of the
corresponding set of equations def 3.1.1. Consider the equations in the basis
of $E_n(MAW)$ corresponding to ex 3.1 first:

Ex 3.1.1   $\{X_c = \max(W(c_1) + X_{c_1}, W(c_2) + X_{c_2}, \ldots, W(c_m) + X_{c_m}) \mid c \in N\}$

where as noted $c_i$ is the i-th neighbor of node c in G. Now in general,
because of the max function, $E_n(MAW)$ with the above basis will not be
open-loop consistent. In fact if the basis of $E_n(MAW)$ contains an equation
in which the same variable appears on both the left and the right, then $E_n(MAW)$
is not open-loop consistent, because suppose $c_j = c$, $a_{c_j} > 0$ and:

(a)   $X_c = \max(a_{c_1} + X_{c_1}, \ldots, a_{c_j} + X_{c_j}, \ldots, a_{c_m} + X_{c_m})$

then if $E_n(MAW)$ was open-loop consistent $X_c$ would also satisfy:

(a')  $X_c = \max(a_{c_1} + X_{c_1}, \ldots, a_{c_{j-1}} + X_{c_{j-1}}, a_{c_{j+1}} + X_{c_{j+1}}, \ldots, a_{c_m} + X_{c_m})$

but it is easy to see, by substitution for $X_{c_j} = X_c$ using (a') in (a), that
this cannot be.

However, if the graph G does not have any loops, then $E_n(MAW)$ is
open-loop consistent, since then the conditions of lemma 3.1 are met. So
if G is loopless, theorem 3.1 can be applied and solutions to the equations
of ex 3.1.1 will be equivalent to an evaluation of the recursive definition in
ex 3.1. The number of equations in ex 3.1.1 is equal to the number of nodes in the
graph G. No variable appears on both the left and right sides of an
equation, and none can come to do so as the result of any substitutions,
because G has no loops. One of the equations in the set, say
$X_n = e_n$, will have only a constant on its right side, i.e. $e_n = a_n$. This
constant may be substituted for all occurrences of $X_n$ on the right of the
other equations. As a result, a second equation, say $X_{n-1} = e_{n-1}[X_n \leftarrow a_n]$,
will have only a constant on its right. This process continues until all
resultant equations up to the one with the variable whose value is sought
have only a constant on their right. The $i$-th substitution involves
substituting for at most $i$ variables, $i$ additions and taking the maximum over
$i$ numbers, or the order of $i$ steps. $i$ ranges from 1 to the number of
equations at most. So after the equations are set up, the solution
algorithm has a complexity of the order of $n^2$.
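The substitution process for a loopless graph can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `solve_max_equations`, `neighbors` and `weight` are hypothetical, $W(c_i)$ is read as the weight of the arc from $c$ to its neighbor $c_i$, and a node other than $t$ with no neighbors is given the value minus infinity (no path to $t$), a case the text does not address.

```python
def solve_max_equations(neighbors, weight, t):
    """Solve X_c = max_i (W(c_i) + X_{c_i}) with X_t = 0 on a loopless graph.

    neighbors: dict mapping a node to the list of its successor nodes.
    weight:    dict mapping an arc (c, ci) to its weight (assumed reading of W(c_i)).
    """
    order, seen = [], set()

    def visit(c):
        # Post-order walk: every successor is listed before the node itself,
        # so each equation has only constants on its right when it is reached.
        if c in seen:
            return
        seen.add(c)
        for ci in neighbors.get(c, []):
            visit(ci)
        order.append(c)

    for c in list(neighbors) + [t]:
        visit(c)

    X = {}
    for c in order:
        if c == t:
            X[c] = 0  # terminal equation: a constant on the right
        else:
            # Substitute the already known constants and take the maximum.
            X[c] = max((weight[(c, ci)] + X[ci] for ci in neighbors.get(c, [])),
                       default=float("-inf"))
    return X


example_neighbors = {"a": ["b", "c"], "b": ["c", "t"], "c": ["t"]}
example_weight = {("a", "b"): 1, ("a", "c"): 4, ("b", "c"): 2,
                  ("b", "t"): 6, ("c", "t"): 3}
max_solution = solve_max_equations(example_neighbors, example_weight, "t")
```

Each equation is touched once and each of its terms is evaluated once, which matches the order-of-$n^2$ bound in the text when every node may have up to $n$ neighbors.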


In ex 3.2, the set of equations $E_n(MIW)$ is open-loop consistent. This
is easily seen, for if $c_j = c$, $a_{c_j} > 0$ and:

\[ X_c = \min(a_{c_1}+X_{c_1},\ \ldots,\ a_{c_j}+X_{c_j},\ \ldots,\ a_{c_m}+X_{c_m}) \tag{a} \]

and consider:

\[ X_c = \min(a_{c_1}+X_{c_1},\ \ldots,\ a_{c_{j-1}}+X_{c_{j-1}},\ a_{c_{j+1}}+X_{c_{j+1}},\ \ldots,\ a_{c_m}+X_{c_m}) \tag{b} \]

substituting with (b) in the right of (a) we get:

\[ X_c = \min\bigl(a_{c_1}+X_{c_1},\ \ldots,\ a_{c_j} + \min(a_{c_1}+X_{c_1},\ \ldots,\ a_{c_{j-1}}+X_{c_{j-1}},\ a_{c_{j+1}}+X_{c_{j+1}},\ \ldots,\ a_{c_m}+X_{c_m}),\ \ldots,\ a_{c_m}+X_{c_m}\bigr) \]

which because of the properties of min (and $a_{c_j} > 0$) gives:

\[ X_c = \min(a_{c_1}+X_{c_1},\ \ldots,\ a_{c_{j-1}}+X_{c_{j-1}},\ a_{c_{j+1}}+X_{c_{j+1}},\ \ldots,\ a_{c_m}+X_{c_m}) \]

The same equation would result from substituting for $X_c$ on the left of (a)
with the right of (b). Thus (b) satisfies (a), verifying the claimed
open-loop consistency. Not only is $E_n(MIW)$ open-loop consistent, but the
basis of $E_n(MIW)$ can be shown to have a unique solution.

Thus a substitutional solution to this set of equations will give a
solution to the corresponding recursive definition. This solution has a
time complexity $\le n^3$, in part because after each substitution step the terms
can be gathered in this case so they will again be a simple min of constants
+'ed with variables, in the same form as the initial basis equations. Furthermore,
in this case, again primarily because of the properties of the
min function, the complexity of the gaussian elimination solution can be
reduced to order $n^2$ by adopting the appropriate order of substitution. The
appropriate order is given by the following considerations.

First the $X_t$ equation will be:

\[ X_t = 0 \]

because $t$ is by definition the last node on a path.
The right side is a constant. So substitute this constant value for $X_t$
throughout the right sides of the other equations. In these other $n-1$ equations carry
out the + and min operations with the substituted value. That will leave
the right side of these equations again in the form of a min of terms, all
except one of these terms being a constant +'ed with a variable, the
exceptional term being simply a constant. The variable $X_t$ will not appear
on the right side of any equation. The key step now is to examine the $n-1$
equations for the one whose constant term is the smallest.

Say this is:

\[ X_j = \min(c_{j_1} + X_{j_1},\ \ldots,\ c_{j_n} + X_{j_n},\ d_j) \]

where $c_{j_i}$ and $d_j$ are constants. Then that equation can be simply replaced by:

\[ X_j = d_j \]

That is, all the variables can be removed from the right side of this
equation. Roughly, this is justified because $d_j$ is the smallest constant
in the set of equations, and consideration of the substitutional method of
solution indicates that no variable can ultimately be assigned a lower value.
Next, in the $n-2$ equations (excepting the $X_j$ and $X_t$ equations, which are solved)
$d_j$ is substituted for all appearances of $X_j$. Terms are then gathered in the
remaining $n-2$ equations. Again the one of those having the smallest constant
is chosen and all its variable terms removed, and so forth, thus providing
the solution for another variable. The process is repeated until the
variable whose value is sought becomes the one having the smallest constant
on the right, of all remaining unsolved equations. That constant is the value of
the variable.

In this method of solution one is always substituting a constant, thus
greatly reducing the effort necessary to simplify the right sides of
equations. It is essentially Dijkstra's algorithm, and its complexity is of
order $n^2$.
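The order of substitution described above can be sketched in Python. This is a minimal illustration, not the author's program: the names `solve_min_equations`, `neighbors` and `weight` are hypothetical, $W(c_i)$ is read as the weight of the arc from $c$ to $c_i$, and all weights are assumed non-negative. Each equation is represented only by its current constant term $d_c$, since the variable terms never need to be rewritten.

```python
import math


def solve_min_equations(neighbors, weight, t):
    """Solve X_c = min_i (W(c_i) + X_{c_i}) with X_t = 0 by repeatedly
    fixing the equation whose constant term is smallest."""
    nodes = set(neighbors) | {ci for cs in neighbors.values() for ci in cs} | {t}
    const = {c: math.inf for c in nodes}  # constant term d_c of each equation
    const[t] = 0                          # the X_t equation is already a constant
    solved, unsolved = {}, set(nodes)
    while unsolved:
        # Key step: the unsolved equation with the smallest constant term.
        j = min(unsolved, key=lambda c: const[c])
        unsolved.remove(j)
        solved[j] = const[j]              # all variable terms are dropped
        # Substitute the constant for X_j and gather terms in the other equations.
        for c in unsolved:
            if (c, j) in weight:
                const[c] = min(const[c], weight[(c, j)] + solved[j])
    return solved


loop_neighbors = {"a": ["b", "t"], "b": ["a", "t"]}
loop_weight = {("a", "b"): 1, ("b", "a"): 1, ("a", "t"): 5, ("b", "t"): 2}
min_solution = solve_min_equations(loop_neighbors, loop_weight, "t")
```

The outer loop runs once per equation and each pass scans the remaining equations once, so the work is of order $n^2$, as the text states. The example graph contains a loop between `a` and `b`, which the max version could not handle.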


Some  Simple  Equivalences 


Throughout this paper we have presented recursive definitions purported
to be 'natural'. In fact, however, for any given algorithm design problem
one may develop a number of recursive definitions, each being equally
entitled to being called natural. The differences in these equally natural
descriptions may range from very small details to deeper differences
representing radically different points of view. We have seen some small
adjustments of detail in getting equivalent formulations of the recursive
definition of theorem 3.1. Deeper differences are represented by the
alternate formulations for the recursive definition of the set of binary
numbers given in the introduction. The path examples of the previous pages
were designed to look natural, but also to fit the theorem formulation;
many other formulations which did not quite fit the theorem formulation
could have been represented as natural. There is no deception here;
they are all natural. There are many equivalent ways of defining the same
function. A number of simple equivalences will now be developed which
account for some common alternatives. Possession of such equivalences allows
one to move easily from definition to definition. As we have seen, some
definitions are closer to good algorithms than others. Some definitions,
unary ones, can even be considered as specifying algorithms. In fact,
all the theorems given thus far can be viewed as equivalences which allow
one to move from an initial definition to a better one. The equivalences
given now differ only in being somewhat simpler than those developed
previously, and seeming to have more to do with initial formulation.

The equivalences to be given all involve trade-offs between
the complexity of the $w$ and $o$ functions, or between the complexity within
the argument of the function being defined (say $f$) and the complexity outside
that argument. If the structure of the argument of $f$ is considered the
data-structure, and the way in which $f$ enters as an argument (of $w$) in the
recursive definition is called the control structure, then these equivalences
give trade-offs between data and control structures.

The significance of the following two theorems can probably be better
appreciated if the examples following the theorems are scanned before the
theorems and proofs are read.


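Theorem 3.2: If operations $\bullet$ and $+$ have the properties

\[
\begin{aligned}
\text{p1:}\quad & X_1 \bullet (X_2 \bullet X_3) = (X_1 \bullet X_2) \bullet X_3 \\
\text{p2:}\quad & y \bullet \Bigl(\sum_{i=1}^{m} X_i\Bigr) = \sum_{i=1}^{m} (y \bullet X_i)
\end{aligned}
\]

and if:

\[
\text{d1:}\quad
\begin{aligned}
g(X) &= q(X) && \text{if } T(X) \\
g(X) &= \sum_{i=1}^{m(X)} \bigl(t_i(X) \bullet g(o_i(X))\bigr) && \text{if } \overline{T}(X)
\end{aligned}
\]

initially $X \in D$

and:

\[
\text{d2:}\quad
\begin{aligned}
f(y,X) &= y \bullet q(X) && \text{if } T(X) \\
f(y,X) &= \sum_{i=1}^{m(X)} f\bigl(y \bullet t_i(X),\ o_i(X)\bigr) && \text{if } \overline{T}(X)
\end{aligned}
\]

initially $X \in D$

then for each $X \in A_g$:

\[ f(y,X) = y \bullet g(X) \]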


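Proof: We simply show that $y \bullet g(X)$, with $g(X)$ as defined in d1, satisfies
or makes true the relations of d2 when appropriately substituted
for $f(y,X)$ in d2.

Thus if $X$ is of remoteness 0, the first equation of d2 defines $f(y,X)$, so that
substituting:

\[
\begin{aligned}
y \bullet g(X) &= y \bullet q(X) \\
y \bullet q(X) &= y \bullet q(X) && \text{since } g(X) = q(X) \text{ when } X \text{ is of remoteness 0, by d1}
\end{aligned}
\]

For $X$ having remoteness > 0, the second equation of d2 defines $f(y,X)$. Substitute
$y \bullet g(X)$ for $f(y,X)$ throughout that equation to determine if it
is thus satisfied. Note that if $X \in A_g$ then certainly $o_i(X)$
is also a member of $A_g$.

\[
\begin{aligned}
y \bullet g(X) &= \sum_{i=1}^{m(X)} \bigl[(y \bullet t_i(X)) \bullet g(o_i(X))\bigr] \\
&= \sum_{i=1}^{m(X)} \bigl[y \bullet (t_i(X) \bullet g(o_i(X)))\bigr] && \text{by p1} \\
&= y \bullet \Bigl(\sum_{i=1}^{m(X)} \bigl[t_i(X) \bullet g(o_i(X))\bigr]\Bigr) && \text{by p2} \\
y \bullet g(X) &= y \bullet g(X) && \text{by d1}
\end{aligned}
\]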

Our first example of the application of theorem 3.2 requires some
definitions. Let $B = \{b_1, \ldots, b_m\}$ and $A = \{a_1, \ldots, a_n\}$ both be sets
whose components are vectors:

\[ B \otimes A = \{b_1//a_1,\ \ldots,\ b_1//a_n\} \cup \bigl(\{b_2, \ldots, b_m\} \otimes A\bigr) \]

With this definition of $\otimes$ the following property clearly holds for A, B
and C each sets of vectors:

\[ \text{p1:}\quad C \otimes (B \otimes A) = (C \otimes B) \otimes A \]

Also, if $\cup$ is the usual union operation, then if each $X_i$, $i \in N$, as well as
$X$ is a set of vectors:

\[ \text{p2:}\quad X \otimes (X_1 \cup X_2 \cup \ldots \cup X_n) = (X \otimes X_1) \cup (X \otimes X_2) \cup \ldots \cup (X \otimes X_n) \]

Now the set of all n bit binary numbers (vectors), $B(n)$, is the set of
all n-1 bit binary numbers, $B(n-1)$, with a 0 attached in front of each
member of the set, $\{\langle 0 \rangle\} \otimes B(n-1)$, together with $B(n-1)$ with a 1
attached in front of each member of the set, $\{\langle 1 \rangle\} \otimes B(n-1)$.

More formally the definition is:

\[
\begin{aligned}
B(n) &= \{\lambda\} && \text{if } n = 0 \\
B(n) &= (\{\langle 0 \rangle\} \otimes B(n-1)) \cup (\{\langle 1 \rangle\} \otimes B(n-1)) && \text{if } n > 0
\end{aligned}
\]

initially $n$ = the number of bits of the binary number

Since $\otimes$ and $\cup$ satisfy p1 and p2, it follows by theorem 3.2 that
$B'(y,n)$ given in the following definition is a member of F and is related
to the definition above in that $B'(y,n) = y \otimes B(n)$.

\[
\text{(2)}\quad
\begin{aligned}
B'(y,n) &= y \otimes \{\lambda\} = y && \text{if } n = 0 \\
B'(y,n) &= B'(y \otimes \{\langle 0 \rangle\},\ n-1) \cup B'(y \otimes \{\langle 1 \rangle\},\ n-1) && \text{if } n > 0
\end{aligned}
\]

initially $n$ = the number of bits of the binary number, $y = \{\lambda\}$

Note that (2) is essentially the definition given in the Introduction
for binary numbers (ex 1.1).
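The relation between the two definitions of the binary numbers can be checked on small cases with a short Python sketch. The representation is an assumption made for illustration: vectors are tuples, sets of vectors are Python sets, and the function names `otimes`, `B` and `B_acc` (standing for $B'$) are hypothetical.

```python
def otimes(B_set, A_set):
    """B (x) A: catenate every vector of B_set with every vector of A_set."""
    return {b + a for b in B_set for a in A_set}


def B(n):
    """Set of all n bit binary vectors, built by attaching a bit in front."""
    if n == 0:
        return {()}                      # the set holding only the empty vector
    return otimes({(0,)}, B(n - 1)) | otimes({(1,)}, B(n - 1))


def B_acc(y, n):
    """Definition (2): the prefix set y is carried along as an argument."""
    if n == 0:
        return y                         # y (x) {empty vector} = y
    return B_acc(otimes(y, {(0,)}), n - 1) | B_acc(otimes(y, {(1,)}), n - 1)


prefixes = {(1,), (0, 0)}
```

With `y` set to the set holding only the empty vector, `B_acc(y, n)` enumerates the same set as `B(n)`; for any other prefix set it returns `otimes(y, B(n))`, which is the relation asserted by theorem 3.2.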

As a second example consider the application of this theorem to the previous
path examples. Equivalent to the definition of ex 3.1 we have, by theorem
3.2, since max (corresponding to $+$) and + (corresponding to $\bullet$) have the
appropriate properties, the following function $MAW'$ (the pair $\langle h,c \rangle$ corresponds
to $X$, $W(c_i)$ corresponds to $t_i(X)$, and $\langle h//c,\ c_i \rangle$ corresponds to $o_i(X)$):

\[
\begin{aligned}
MAW'(y,h,c) &= y + 0 && \text{if } c \in \{h\} \text{ or } c = t \\
MAW'(y,h,c) &= \max\bigl(MAW'(y+W(c_1),\ h//c,\ c_1),\ \ldots,\ MAW'(y+W(c_m),\ h//c,\ c_m)\bigr) && \text{if } c \in N - \{h\}
\end{aligned}
\]

initially $h = \lambda$, $c \in N$, $y = 0$
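The following Python sketch compares the two path definitions on a small graph. It is an illustration only: the names `maw`, `maw_acc`, `NEIGHBORS` and `W` are hypothetical, $W(c_i)$ is read as the weight of the arc from $c$ to $c_i$, the path history $h$ is a tuple, and every non-terminal node is assumed to have at least one neighbor.

```python
NEIGHBORS = {"a": ["b", "t"], "b": ["a", "t"]}
W = {("a", "b"): 1, ("b", "a"): 2, ("b", "t"): 3, ("a", "t"): 1}
TARGET = "t"


def maw(h, c):
    """Ex 3.1: weights are added on the way back out of the recursion."""
    if c in h or c == TARGET:
        return 0
    return max(W[(c, ci)] + maw(h + (c,), ci) for ci in NEIGHBORS[c])


def maw_acc(y, h, c):
    """MAW': the weight so far is carried in the extra argument y."""
    if c in h or c == TARGET:
        return y + 0
    return max(maw_acc(y + W[(c, ci)], h + (c,), ci) for ci in NEIGHBORS[c])
```

In `maw_acc` nothing remains to be done after a recursive call returns except taking the max, which is the trade of control structure for data structure that the text describes.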

The next equivalence allows the replacement of a vector in the
data-structure by an individual component at the expense of a more complex
control structure.

Definitions:

$\alpha$ is a sequence of symbols in the alphabet A.

If $a \in A$, $R_i(a)$ for $i \le m(a)$ is a sequence of symbols in A. There is
a subset of A, say A', such that for each $a' \in A'$, $m(a') = 1$ and $R_1(a') = \lambda$.
This is necessary if the definition of $f$ in the following theorem is to be
legitimate.




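Theorem 3.3: Given the binary operations $\bullet$ and $+$ with the following properties:

\[
\begin{aligned}
\text{p1:}\quad & (X_1 \bullet X_2) \bullet X_3 = X_1 \bullet (X_2 \bullet X_3) \\
\text{p2:}\quad & \Bigl(\sum_i X_i\Bigr) \bullet Z = \sum_i (X_i \bullet Z) \\
\text{p3:}\quad & 0 \bullet X = X \bullet 0 = X
\end{aligned}
\]

and given the definition:

\[
\text{d1:}\quad
\begin{aligned}
f(\alpha) &= 0 && \text{if } \alpha = \lambda \\
f(\alpha) &= \sum_{i=1}^{m(\alpha_1)} r_i(\alpha_1) \bullet f(\alpha[\alpha_1 \leftarrow R_i(\alpha_1)]) && \text{if } \alpha \ne \lambda
\end{aligned}
\]

initially $\alpha = \langle a \rangle$ where $a \in A$

then:

\[ \text{fact1:}\quad f(\alpha) = f(\alpha_{1:j}) \bullet f(\alpha_{j+1:n}) \quad \text{if } j \in N \]

and further given the definition:

\[
\text{d2:}\quad
\begin{aligned}
g(a) &= \sum_{i=1}^{m(a)} r_i(a) \bullet g([R_i(a)]_1) \bullet g([R_i(a)]_2) \bullet \ldots \bullet g([R_i(a)]_n) && \text{if } a \notin A' \\
g(a) &= 0 && \text{if } a \in A'
\end{aligned}
\]

initially $a \in A$

then:

\[ \text{fact2:}\quad \text{for each } a \in A,\ f(\langle a \rangle) = g(a) \]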


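Proof: First we show that:

\[ f(\alpha) = f(\langle \alpha_1 \rangle) \bullet f(\alpha_{2:n}) \tag{I} \]

Assume, since $\alpha[\alpha_1 \leftarrow R_i(\alpha_1)]$ is of lesser remoteness than $\alpha$, that
fact1 is true for it, i.e.:

\[ f(\alpha[\alpha_1 \leftarrow R_i(\alpha_1)]) = f(R_i(\alpha_1)) \bullet f(\alpha_{2:n}) \]

Thus:

\[
\begin{aligned}
f(\alpha) &= \sum_{i=1}^{m(\alpha_1)} r_i(\alpha_1) \bullet \bigl(f(R_i(\alpha_1)) \bullet f(\alpha_{2:n})\bigr) \\
&= \sum_{i=1}^{m(\alpha_1)} \bigl(r_i(\alpha_1) \bullet f(R_i(\alpha_1))\bigr) \bullet f(\alpha_{2:n}) && \text{by p1} \\
&= \Bigl(\sum_{i=1}^{m(\alpha_1)} r_i(\alpha_1) \bullet f(R_i(\alpha_1))\Bigr) \bullet f(\alpha_{2:n}) && \text{by p2} \\
&= f(\langle \alpha_1 \rangle) \bullet f(\alpha_{2:n})
\end{aligned}
\]

Here $\alpha_{i:j} = \langle \alpha_i, \alpha_{i+1}, \ldots, \alpha_j \rangle$, $\alpha = \langle \alpha_1, \ldots, \alpha_n \rangle$, and $\alpha_{i:i} = \langle \alpha_i \rangle$.

The basis for this induction is given next.

If $\alpha$ is of remoteness 1 then $\alpha[\alpha_1 \leftarrow R_1(\alpha_1)]$ must be of remoteness 0,
so:

\[
\begin{aligned}
f(\alpha) &= r_1(\alpha_1) \bullet f(\alpha[\alpha_1 \leftarrow R_1(\alpha_1)]) \\
&= \sum_{i=1}^{1} r_i(\alpha_1) \bullet f(\lambda) \\
&= \sum_{i=1}^{1} r_i(\alpha_1) && \text{since } f(\lambda) = 0 \text{ by d1, and p3}
\end{aligned}
\]

Since $\alpha[\alpha_1 \leftarrow R_1(\alpha_1)] = \lambda$, $\alpha$ must equal $\langle a' \rangle$ with $a' \in A'$.

Therefore if $\alpha$ is of remoteness 1 then $\alpha = \langle a' \rangle$, $\alpha_1 = a'$,
$\alpha_{2:n} = \lambda$, and:

\[
\begin{aligned}
f(\alpha) = f(\langle a' \rangle) &= \sum_{i=1}^{1} r_i(a') \\
&= \Bigl(\sum_{i=1}^{1} r_i(a')\Bigr) \bullet f(\lambda) && \text{by p3} \\
&= f(\langle a' \rangle) \bullet f(\alpha_{2:n})
\end{aligned}
\]

Now assume that for $1 \le j < n$:

\[ f(\alpha) = f(\alpha_{1:j}) \bullet f(\alpha_{j+1:n}) \tag{II} \]

This has already been shown true when $j = 1$. By (I) applied to $\alpha_{j+1:n}$,
$f(\alpha_{j+1:n}) = f(\langle \alpha_{j+1} \rangle) \bullet f(\alpha_{j+2:n})$, so:

\[
\begin{aligned}
f(\alpha) &= f(\alpha_{1:j}) \bullet \bigl(f(\langle \alpha_{j+1} \rangle) \bullet f(\alpha_{j+2:n})\bigr) && \text{by assumption (II)} \\
&= \bigl(f(\alpha_{1:j}) \bullet f(\langle \alpha_{j+1} \rangle)\bigr) \bullet f(\alpha_{j+2:n}) && \text{by p1} \\
&= f(\alpha_{1:j+1}) \bullet f(\alpha_{j+2:n}) && \text{by (II)}
\end{aligned}
\]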

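This completes the proof of fact1. fact1 may be used to establish
fact2. By d1, if $a \notin A'$:

\[
\begin{aligned}
f(\langle a \rangle) &= \sum_{i=1}^{m(a)} r_i(a) \bullet f(R_i(a)) \\
&= \sum_{i=1}^{m(a)} r_i(a) \bullet f(\langle [R_i(a)]_1 \rangle) \bullet \ldots \bullet f(\langle [R_i(a)]_n \rangle)
\end{aligned}
\]

and if $a \in A'$:

\[ f(\langle a \rangle) = 0 \]

Note that these last two equations for $f(\langle a \rangle)$ are almost identical
to the equations for $g(a)$ in d2. A simple inductive argument
based on this near identity shows that:

\[ f(\langle a \rangle) = g(a) \]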

Example of Application of Theorem 3.3

This example involves a problem which, together with its 'good' solution,
is discussed in [3]. We will see how one could get from an initial
formulation of a recursive solution to this problem to the 'good' algorithm of [2]
via theorem 3.3. The problem involves the multiplication of a set of
matrices: $M_1 \times M_2 \times \ldots \times M_n$. The dimensions of each matrix are given. The
dimensions of $M_i$ are $(r_i, c_i)$. The number of multiplications necessary to
multiply $M_i$ by $M_{i+1}$ is then $r_i \times (c_i = r_{i+1}) \times c_{i+1}$. The multiplication
$M_1 \times M_2 \times \ldots \times M_n$ may be associated in any way to get the same answer; so
$(M_1 \times M_2) \times M_3 = M_1 \times (M_2 \times M_3)$ for example; but with differing numbers
of multiplications required. The problem is to design an algorithm to find
an association which will give the minimum number of multiplications. In
fact, we will design an algorithm to find that minimum number of
multiplications, which is the heart of the matter. The initial approach is to
somehow enumerate all ways of associating $M_1 \times \ldots \times M_n$ and the corresponding
costs in number of multiplications, and then to choose the association giving
the minimum of these. Different associations are equivalent to different
ways in which the expression $M_1 \times \ldots \times M_n$ can be parenthesized. The
last matrix multiplication in any such association must involve one of the
following alternatives:




$M_1$ with the result of $(M_2 \times \ldots \times M_n)$, with a cost of
$r_1 \times (c_1 = r_2) \times c_n$ + the cost (=0) of doing $(M_1)$ + the cost
of doing $(M_2 \times \ldots \times M_n)$, or

The result of $(M_1 \times M_2)$ with the result of $(M_3 \times \ldots \times M_n)$,
with a cost of $r_1 \times (c_2 = r_3) \times c_n$ + the cost of doing $(M_1 \times M_2)$
+ the cost of doing $(M_3 \times \ldots \times M_n)$, or ...

The result of $(M_1 \times \ldots \times M_{n-1})$ with $M_n$, with a cost of
$r_1 \times (c_{n-1} = r_n) \times c_n$ + the cost of doing $(M_1 \times \ldots \times M_{n-1})$ +
the cost (=0) of doing $(M_n)$.

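If $\alpha = \langle i_1,j_1 \rangle, \langle i_2,j_2 \rangle, \ldots, \langle i_n,j_n \rangle$ and $f(\alpha)$ = the cost of doing
$(M_{i_1} \times \ldots \times M_{j_1})$ + the cost of doing $(M_{i_2} \times \ldots \times M_{j_2})$ + ... + the cost
of doing $(M_{i_n} \times \ldots \times M_{j_n})$ then:

\[
\begin{aligned}
f(\alpha) &= 0 && \text{if } \alpha = \lambda \\
f(\alpha) &= f(\alpha[\alpha_1 \leftarrow \lambda])\,^{*} && \text{if } \alpha \ne \lambda,\ i_1 = j_1 \\
f(\alpha) &= \operatorname{Min}\bigl((r_{i_1} \times c_{i_1} \times c_{j_1}) + f(\alpha[\alpha_1 \leftarrow \langle i_1,i_1 \rangle, \langle i_1+1,j_1 \rangle]),\ \ldots, \\
&\qquad\quad (r_{i_1} \times c_{j_1-1} \times c_{j_1}) + f(\alpha[\alpha_1 \leftarrow \langle i_1,j_1-1 \rangle, \langle j_1,j_1 \rangle])\bigr) && \text{if } \alpha \ne \lambda,\ j_1 > i_1
\end{aligned}
\]

initially $\alpha = \langle \langle 1,n \rangle \rangle$

Let $a = \langle i,j \rangle$ in which $i,j \in N$ and $j \ge i$. If $m(a) = 1$ and $R_1(a) = \lambda$
when $i = j$, and $m(a) = j - i$ and $R_k(a) = \langle i,i+k-1 \rangle, \langle i+k,j \rangle$ when $j > i$,
then the above recursive definition can be rewritten:

\[
\begin{aligned}
f(\alpha) &= 0 && \text{if } \alpha = \lambda \\
f(\alpha) &= \operatorname{Min}_{k=1 \text{ to } m(\alpha_1)} \bigl((r_i \times c_{i+k-1} \times c_j) + f(\alpha[\alpha_1 \leftarrow R_k(\alpha_1)])\bigr) && \text{if } \alpha \ne \lambda
\end{aligned}
\]

initially $\alpha = \langle \langle 1,n \rangle \rangle$

*$\alpha[\alpha_1 \leftarrow \lambda]$ means remove the first component of $\alpha$.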




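Now this definition fits d1 of theorem 3.3 with $\bullet$ being addition and $\sum$
being Min. These operations have the appropriate properties, so applying
the theorem we get the following definition (in which we have resubstituted
for $R_k(a)$), equivalent to the previous definition, with $a = \langle i,j \rangle$,
$i,j \in N$ and $j \ge i$.

\[
\begin{aligned}
g(a) &= 0 && \text{if } i = j \\
g(a) &= \operatorname{Min}_{k=1 \text{ to } m(a) = j-i} \bigl((r_i \times c_{i+k-1} \times c_j) + g(\langle i,i+k-1 \rangle) + g(\langle i+k,j \rangle)\bigr) && \text{if } j > i
\end{aligned}
\]

initially $a = \langle 1,n \rangle$

If we replace $g(\langle i,j \rangle)$ by $X_{ij}$ and $r_i \times c_{i+k-1} \times c_j$ by $a_{ijk}$ in the
above definition, then that definition amounts to a set of $n^2$ equations,
each equation being of the form:

\[ X_{ij} = \min(a_{ij1} + X_{ii} + X_{i+1,j},\ a_{ij2} + X_{i,i+1} + X_{i+2,j},\ \ldots,\ a_{ij(j-i)} + X_{i,j-1} + X_{jj}) \]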

This set of equations, like that for maximum path in a graph with no loops,
forms the basis of a set $E_n$ that is open-loop consistent. The reason
is that no equation in the equation set $E_n$ will have the same
variable on both sides of the equation. This is because in the basis
equations, when $X_{ij}$ appears on the left and $X_{kp}$ appears on the right of
an equation, $j-i > p-k$, and so the same variable cannot appear on the right
as on the left of any equation in $E_n$, since this property will be preserved
even after substitution.

A process analogous to that described for maximum path can be used.
The values of $X_{ij}$ with $i = j$ are known. By substituting these values, the
solution of all $X_{ij}$ with $j-i = 1$ can be found. In general $X_{ij}$ with $j-i = k$
can be determined from the solutions for $X_{ij}$ with $j-i < k$,
until $X_{1n}$ is found. This process has a complexity of order $n^3$.
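The order of solution just described (all $X_{ij}$ with $j-i = 1$, then $j-i = 2$, and so on) can be sketched in Python. The function name `min_multiplications` and the use of two lists `r` and `c` for the row and column counts are assumptions made for illustration; indices are 0-based, so the split variable `k` marks the last matrix of the left factor.

```python
def min_multiplications(r, c):
    """Minimum number of scalar multiplications to form M_1 x ... x M_n,
    where matrix i has r[i] rows and c[i] columns (c[i] == r[i + 1])."""
    n = len(r)
    # X[i][j] is the cost of the best association of M_i ... M_j; X[i][i] = 0.
    X = [[0] * n for _ in range(n)]
    for span in range(1, n):             # span = j - i, solved in increasing order
        for i in range(n - span):
            j = i + span
            # Last multiplication joins (M_i..M_k) with (M_k+1..M_j).
            X[i][j] = min(r[i] * c[k] * c[j] + X[i][k] + X[k + 1][j]
                          for k in range(i, j))
    return X[0][n - 1]


# Three matrices of dimensions 10x30, 30x5 and 5x60.
best = min_multiplications([10, 30, 5], [30, 5, 60])
```

There are of the order of $n^2$ table entries and each takes a min over at most $n$ terms. In the example, associating as $(M_1 \times M_2) \times M_3$ costs 1500 + 3000 = 4500 multiplications, against 27000 for the other association.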

Conclusions:

The study of the relation of recursive definitions to their 'good'
implementation is a place in which many important concepts and results,
developed in diverse regions of computer science, seem to come together.
One has here a natural way of classifying algorithms according to properties
of their recursive definitions, which cuts across relatively superficial
classifications according to application areas. The strong analogy between
recursive definitions and differential equations on the one hand, and 'good
algorithms' and closed (or otherwise good) solutions on the other hand,
supports the expectation that this study is a place to bring it all
together.


References: 

1. Aho, Hopcroft, Ullman; The Design and Analysis of Computer Algorithms;
   Addison-Wesley, 1975; pp 195, 222

2. Darlington, J. and Burstall, R.M.; A System Which Automatically
   Improves Programs; Proceedings Third International Joint Conference
   on Artificial Intelligence; Stanford, California, 1973; pp 479-485

3. Darlington, J. and Burstall, R.M.; A Transformation System for
   Developing Recursive Programs; JACM; January, 1976

4. Nilsson, N.; Problem Solving Methods in Artificial Intelligence;
   McGraw-Hill; 1971

5. Strong, H.R.; Translating Recursive Equations Into Flow Charts; Journal
   of Computer and System Sciences; 1971; pp 254-285

6. Strong, H.R. and Walker, S.A.; Characterization of Flowchartable
   Recursions; Journal of Computer and System Sciences; Vol. 7, No. 4;
   August, 1973; pp 407-447

7. Pauli, M.C.; Formulation and Manipulation of Enumeration Based Algorithms;
   Research Report SOSAP-TR-4; December, 1973

8. Pauli, M.C.; Properties Which Allow Optimizing the Implementation of
   Recursive Definitions and Notes on Searching for Some Such Properties;
   Research Report SOSAP-TM-5; September, 1974


Appendix  I 


Summary of Frequently Used Notation

If P is a predicate then $\overline{P}$ means not P.

M is the set of all positive integers = {1,2,...}

N is the finite set of integers from 1 to n = {1,...,n}

If A and B are sets:
$A \cup B$ is set union
$A \cap B$ is set intersection
$\overline{A}$ is the complement of A
$A - B = A \cap \overline{B}$

$|A|$ = the number of elements in A

$\langle a_1, \ldots, a_n \rangle$ is an ordered set or vector with components
$a_i$, $i \in N$, and $a_{i:j}$ represents the subvector $\langle a_i, a_{i+1}, \ldots, a_j \rangle$; $a_{i:i} = \langle a_i \rangle$

If A and B are ordered sets $\langle a_1, \ldots, a_n \rangle$ and $\langle b_1, \ldots, b_n \rangle$ respectively:
$A // B = \langle a_1, \ldots, a_n, b_1, \ldots, b_n \rangle$

$\{A\}$ is the set of all components in A

If E, x and y are each an expression, i.e. a string or ordered set of
symbols from a given alphabet, usually satisfying some constraints as to
form, then

$E[x \leftarrow y]$ is the expression that results when each occurrence of x is
replaced by y in E.

The notation is extended to allow the specification of a number of
replacements: $E[x \leftarrow y,\ z \leftarrow w]$ is the expression which results when
each x is replaced by y and each z by w in E.




An entire set of replacements can also be specified, i.e. if E, $x_i$ and
$y_i$ for all $i \in N$ are expressions

$E[x_i \leftarrow y_i \mid i \in N]$ is the expression that results when each occurrence
of $x_i$ is replaced by $y_i$ in E for all $i \in N$

The notation also composes to allow specification of multiple
replacements, i.e.

\[ (E[x \leftarrow y])[x_i \leftarrow y_i \mid i \in N] = E'[x_i \leftarrow y_i \mid i \in N] \]

where $E' = E[x \leftarrow y]$.

The notation is also extended to specify replacement of 1 component of E.

Thus if E and x are expressions and i an integer

$E[E_i \leftarrow x]$ is the result of replacing the $i$-th symbol in E by the
expression x

$E[E_{i:i+k} \leftarrow x]$ is the result of replacing the subexpression of the
$i$-th thru $(i+k)$-th symbols in E with x.

A further extension allows specifying the insertion of a string between
symbols

$E[E_{i+} \leftarrow x] = E[E_{(i+1)-} \leftarrow x]$ is the result of inserting x between the
$i$-th and $(i+1)$-st symbol of E.


SOSAP-TR-38 


May  1977 


MEMORY  EFFICIENT  IMPLEMENTATIONS  OF  RECURSIVE  DEFINITIONS 
M.  C.  Pauli 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6  to  the 
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


1.  INTRODUCTION 

Typically there are significant differences between the initial formulation
of an algorithm and its ultimate implementation. For example, the minimum
path between two nodes in a weighted di-graph can be found by enumerating
all paths between the two nodes and choosing the smallest. This approach
can easily be formulated as a recursively defined function, which may in
turn be implemented in a standard way. This is significantly different
from Dijkstra's algorithm, the favored shortest path implementation. On
the one side, close to the problem statement, there is an initial,
simply formulated, but often inefficient algorithm. On the other side,
nearer to the final implementation, is an efficient algorithm. The study
of the connection between these two is the subject of this paper.

It will be assumed that the initial formulation of an algorithm is
as a recursive definition and that this definition is in a standard form
(to be given). The standard form was chosen because, firstly, it is one
which, in our experience, has frequently arisen naturally as an initial
algorithm formulation. Secondly, the chosen form lends itself nicely to an
overview of a variety of possible implementations of the algorithm thus
formulated. The recursive definition, though sufficient to provide the value
of the function anywhere in its domain, is non-deterministic as to which of
a variety of sequential implementations is to be used to determine that
value. The variety of implementations corresponds to the various orders of
substitution which are equally valid in evaluating such a definition.
Some orders of evaluation
become possible only if the primitive functions which enter into the
recursive definition have appropriate properties. Different orders of
evaluation will result in different memory requirements, but will not cause
significant time differences in the resultant implementations. This
dependence of memory requirements on the order of evaluation is the main
subject of this paper.


Related  Work 

The work reported here is in an area of study in which there have
been a number of significant publications. Strong has identified a class of
recursive definitions for which memory efficient implementations
(called 'flowcharts') are available. This class is defined in terms of a
recursive scheme whose constituent primitive functions are virtually
unrestricted. If the properties of these primitive functions are restricted
somewhat, a wider class of recursive definition forms will yield similar memory
efficient implementations. Such restrictions are considered here because
they arise naturally in practice. So this aspect of the work can be
considered an extension of Strong's results.

Burstall and Darlington studied properties of recursive definitions
whose existence allows efficient implementation, with one objective
being the incorporation of a search for such properties in an optimizing
compiler [2]. Later Burstall and Darlington extended this study to consideration
of transformations of recursive definitions which are likely to produce
better implementations [3]. The spirit of our work here is largely in tune
with that of these investigators, with some significant differences in
emphasis and in the particular properties studied. Our emphasis has been
mainly on understanding the complete set of properties which allow the
transformation from an initial recursive definition to the best algorithms
actually known, and on the proof of this connection. Thus we tend to
consider a few relatively complex sets of properties and transformations as
opposed to many simple ones. We also study mainly one form of
first-order¹ recursive definitions, rather than the many forms they consider.

The remainder of this introduction is devoted to a sketch of the
definitions and results to be detailed in the body of the paper.

¹ First-order means a definition in which the defined function symbol never
appears nested on the right.

Appendix I contains a summary of most of the notation used in the
paper. (This notation is also defined on first use in the paper.)


The  Standard  Form 

This paper concerns the implementation of recursive definitions of a
function $f(X)$ in a class F in which every definition has the following
form:

\[
\text{I:}\quad
\begin{aligned}
f(X) &= q(X) && \text{if } T(X) && \text{(terminal condition and values)} \\
f(X) &= w\bigl(f(o_1(X)),\ \ldots,\ f(o_{m(X)}(X))\bigr) && \text{if } \overline{T}(X) && \text{(body)}
\end{aligned}
\]

initially $X \in D_f$ (domain of function $f$)

where the data structure $X \in D_f$, primitive functions $w$, $q$, $o_i \in O$, $m$, and predicates
$T$ in the definition, collectively designated by the tuple $\langle D,w,q,O,m,T \rangle$, must
be constrained so as to make I a terminating definition.

A definition is terminating if, for each $d \in D_f$, the sequence
of expressions resulting from substitution for forms $f(\alpha)$ (where $\alpha$ is any
expression using $X$), which starts with $f(d)$, $d \in D_f$, and next produces
$w(f(o_1(d)), \ldots)$, etc., has the properties:

(1) It is always possible to evaluate $T(\alpha)$, and if $T(\alpha)$ is false
it is always possible to evaluate $m(\alpha)$, and $o_i(\alpha)$ for $1 \le i \le m(\alpha)$.

(2) Independent of the order of substitution for the different
appearances of the form $f(\alpha)$, after the same finite number of
such substitutions a 'terminal' expression will be obtained
in which, for every appearance $f(\alpha)$, $\alpha$ is terminal (i.e. $T(\alpha)$ is
true) and $q(\alpha)$ can be evaluated.

(3) The function $w$ is defined so as to make it possible to
evaluate the terminal expression in any order consistent with its
parentheses structure.

The tuples $\langle D,w,q,O,m,T \rangle$ which satisfy the above constraint are
members of the set V. The set of definitions of form I which satisfy these
constraints constitute the recursive scheme F(V).
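One standard way of evaluating a definition of form I can be sketched in Python as a direct recursive evaluator. This is only an illustration under assumptions: the names `evaluate` and `binary_strings` are hypothetical, the functions $o_1, \ldots, o_{m(X)}$ are packaged as a single function `o` returning a list, and `w` takes the list of sub-results. The instance shown enumerates n bit binary strings as tuples.

```python
def evaluate(X, T, q, w, o):
    """Evaluate f(X) for a terminating definition <D, w, q, O, m, T>.

    T(X): terminal predicate;  q(X): terminal value;
    o(X): list [o_1(X), ..., o_m(X)];  w(values): combines the sub-results.
    """
    if T(X):
        return q(X)
    return w([evaluate(Y, T, q, w, o) for Y in o(X)])


def binary_strings(n):
    # X is a pair (prefix, remaining bits); the prefix grows by one bit per level.
    return evaluate(
        ((), n),
        T=lambda X: X[1] == 0,
        q=lambda X: {X[0]},
        w=lambda parts: set().union(*parts),
        o=lambda X: [(X[0] + (0,), X[1] - 1), (X[0] + (1,), X[1] - 1)],
    )
```

This evaluator fixes one particular order of substitution (left to right, depth first); the definition itself leaves that order open.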

This form of definition often arises in practice as an initial
solution to an algorithm design problem, particularly when the problem can be
viewed as requiring an enumeration, or an enumeration followed by a
selection (search). The examples of recursive definitions in F(V) given below
arose from adopting such a point of view. Their structure can be
easily seen by evaluating them for some small initial values of their
arguments.

Examples: 

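Ex. 1.1  If f(X) is to be the set of all n-bit binary numbers (let $N$
be the set of positive integers), then:

$$X \in \{\langle \alpha, n\rangle \mid \alpha \text{ a string of 0's and 1's},\ n \in N\}$$

$$f(X) = f(\alpha,n) = \begin{cases} \{\alpha\} & \text{if } n = 0 \\ f(\alpha \frown \langle 0\rangle,\, n-1) \cup f(\alpha \frown \langle 1\rangle,\, n-1) & \text{if } n > 0 \end{cases}$$

where $\frown$ is string catenation, and $\cup$ set union.

X is initially $\in \{\langle \Lambda, n\rangle \mid n \in N\}$, with $\Lambda$ the empty string.

Then, e.g., $f(\Lambda,2) = f(\langle 0\rangle,1) \cup f(\langle 1\rangle,1) = (f(\langle 0,0\rangle,0) \cup f(\langle 0,1\rangle,0)) \cup f(\langle 1\rangle,1)$
$= \{\langle 00\rangle\} \cup f(\langle 0,1\rangle,0) \cup f(\langle 1\rangle,1) =$ etc.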
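As a minimal sketch (not part of the original report), the definition of Ex. 1.1 can be transcribed almost literally into Python. It assumes that Python strings of '0' and '1' stand for the strings $\alpha$ and that the built-in set union plays the role of w; the name `binary_numbers` is an illustrative choice.

```python
def binary_numbers(alpha: str, n: int) -> set:
    """f(alpha, n) of Ex. 1.1: all strings alpha followed by n further bits."""
    if n == 0:
        # Terminal case: T(X) holds and q(X) = {alpha}.
        return {alpha}
    # o_1 appends '0', o_2 appends '1'; both decrement n. w is set union.
    return binary_numbers(alpha + "0", n - 1) | binary_numbers(alpha + "1", n - 1)


if __name__ == "__main__":
    print(sorted(binary_numbers("", 2)))  # ['00', '01', '10', '11']
```

The recursion holds one partial union per pending call, which is the stack cost the memory efficient scheme discussed later is meant to avoid.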




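Ex. 1.2  If f(X) is the set of all permutations of the first n integers,

$$X \in \{\langle n, \alpha\rangle \mid n \in N,\ \alpha \text{ is a string of positive integers}\}$$

$$f(X) = f(n,\alpha) = \{\alpha\} \quad \text{if } n = 0$$

and if $p = |\alpha| =$ the length of $\alpha$ then

$$f(X) = f(n,\alpha) = f(n-1, \alpha[\alpha_0 \leftarrow n]) \cup \ldots \cup f(n-1, \alpha[\alpha_p \leftarrow n]) \quad \text{if } n > 0$$

where $\alpha[\alpha_j \leftarrow n]$ is an inserting function; i.e.
if $\alpha = \langle \alpha_1, \ldots, \alpha_p\rangle$ then $\alpha[\alpha_j \leftarrow n]$ is the result of inserting the
integer n after component $\alpha_j$ in $\alpha$, or is $\langle \alpha_1, \ldots, \alpha_j, n, \alpha_{j+1}, \ldots, \alpha_p\rangle$
(with $j = 0$ meaning insertion at the front).

X is initially $\in \{\langle n, \Lambda\rangle \mid n \in N\}$

Then, e.g., $f(2,\Lambda) = f(1,\langle 2\rangle) = f(0,\langle 12\rangle) \cup f(0,\langle 21\rangle) = \{\langle 12\rangle\} \cup f(0,\langle 21\rangle)$
$= \{\langle 12\rangle\} \cup \{\langle 21\rangle\}$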
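The insertion scheme of Ex. 1.2 can be sketched in the same way. This is an illustrative transcription only: tuples stand for the strings $\alpha$, slicing implements the inserting function, and the name `permutations_by_insertion` is an assumption of this sketch.

```python
def permutations_by_insertion(n: int, alpha: tuple = ()) -> set:
    """f(n, alpha) of Ex. 1.2: insert n, n-1, ..., 1 into alpha in every position."""
    if n == 0:
        # Terminal case: q(X) = {alpha}.
        return {alpha}
    result = set()
    # One o-function per insertion point j = 0, ..., |alpha|; w is set union.
    for j in range(len(alpha) + 1):
        result |= permutations_by_insertion(n - 1, alpha[:j] + (n,) + alpha[j:])
    return result


if __name__ == "__main__":
    print(sorted(permutations_by_insertion(2)))  # [(1, 2), (2, 1)]
```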


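Ex. 1.3  f(X) is the string of moves (each a pair of numbers $\langle a,b\rangle$
meaning move a disc from pin a to pin b) necessary to optimally
solve the now classical Tower of Hanoi puzzle. To move n
discs initially on pin 1 to pin 2:

$$X \in \{\langle\langle x,y,z\rangle, n\rangle \mid \langle x,y,z\rangle \text{ is a permutation of } \langle 1,2,3\rangle,\ n \in N\}$$

$$f(\langle\langle x,y,z\rangle,n\rangle) = \begin{cases} \langle x,y\rangle & \text{if } n = 1 \\ f(\langle\langle x,z,y\rangle,n-1\rangle) \frown f(\langle\langle x,y,z\rangle,1\rangle) \frown f(\langle\langle z,y,x\rangle,n-1\rangle) & \text{if } n > 1 \end{cases}$$

X is initially $\in \{\langle\langle 1,2,3\rangle, n\rangle \mid n \in N\}$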
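A direct, hedged transcription of Ex. 1.3 into Python is given below. Lists of pairs stand for strings of moves and list concatenation for catenation; the function name `hanoi` is an assumption of this sketch.

```python
def hanoi(pins: tuple, n: int) -> list:
    """f(<<x,y,z>,n>) of Ex. 1.3: moves taking n discs from pin x to pin y."""
    x, y, z = pins
    if n == 1:
        # Terminal case: a single move from x to y.
        return [(x, y)]
    # o_1, o_2, o_3 in order; w is catenation of the three move strings.
    return hanoi((x, z, y), n - 1) + hanoi((x, y, z), 1) + hanoi((z, y, x), n - 1)


if __name__ == "__main__":
    print(hanoi((1, 2, 3), 2))  # [(1, 3), (1, 2), (3, 2)]
```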


Algorithms to Implement Definitions in F(V) which are Efficient in Use of Memory

An 'algorithm scheme' defining a set of algorithms is defined in a manner
analogous to that used in defining a recursive scheme like F(V). In this paper
algorithm schemes generally will involve standard assignment and conditional
statements using the same unspecified set of data-structures D, primitive
functions w, q, O, m and predicate T (designated by the tuple $\langle D,w,q,O,m,T\rangle$)
used in defining the recursive scheme F(V). If we constrain the selection
of tuples to be a member of a set V, the set of algorithms thus defined
is designated S(V), and a particular algorithm in S(V) corresponding to a
tuple $v \in V$ is designated S(v). The recursive and algorithm schemes
F(V) and S(V) are equivalent iff for each $v \in V$, F(v) is equivalent to
S(v). A recursive function definition F(v) and an algorithm S(v) are
equivalent if, with domain D in v, for every $d \in D$, the value of f(d) as
computed with recursive definition F(v) equals the value of the result of running
the algorithm S(v) with $d \in D$ as its initial value.

The main purpose of this paper is to show that for a set
V', built from V by constraining the function w to be 'associative'
and the set of functions O to have an 'inverse',² there is an algorithm
scheme S(V') equivalent to F(V') which is particularly efficient in its
use of memory. The algorithm scheme available when these conditions are
satisfied is given in figure 2.2. The algorithm scheme S(V') is given
in terms of the data-structures, primitive functions and transformations
of these primitive functions (inverse of O for example), immediately
available under the assumption of the existence of an 'inverse', that
appear in the equivalent recursive scheme F(V').

2  These terms are defined in section 2. An inverse operation plays a similar
role in (6). Our 'inverse', however, is different, having been independently
developed (7,8) in combination with associativity to delineate another class
of definitions with efficient implementations. The result is in Theorem 2.2.

For many of the recursive definitions in the class F(V'), the equivalent
member of the class S(V'), which can be obtained mechanically from
the recursive definition, is the 'good' algorithm usually used to realize
that definition. Thus, corresponding to example 1.1, the algorithm obtained
by instantiation of that particular D, w, q, O, m and T in S(V')
is one in which:


First a string of n 0's is formed and outputted, being the first
binary number produced; then, because the rightmost symbol in the string
is a 0, it is changed to a 1 and the result outputted. In general, the
algorithm remembers the last binary number formed and outputted, say X.
The next binary number is formed by a scan of the bits of X starting with
the rightmost bit, and changing them by the following scheme. Let b be
the bit under scrutiny: if b is a 0 it is changed to a 1 and the result
is the next binary number to be outputted; if it is a 1 it is changed to
a 0, b becomes the bit in X one position to the left of the current b,
and the scrutiny is repeated. When the leftmost bit of a number X becomes
b and that bit = 1 then the process terminates. In summary this
algorithm for producing all n-bit binary numbers consists simply in 'adding
1' to produce successive members of the set. It is the 'good' algorithm
for producing the set. It keeps in memory only the last number produced, thus
using an amount of storage roughly equal to that required to hold the argument
of f in its recursive definition. This is characteristic of all the
algorithms in S(V') in relation to the equivalent member of F(V') and is
the 'memory efficiency' mentioned.
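The 'adding 1' procedure just described can be sketched as a Python generator. This is an illustration written for this text rather than the scheme S(V') itself; the name `add_one_enumeration` and the list-of-bits representation are assumptions.

```python
def add_one_enumeration(n: int):
    """Yield all n-bit binary strings in order, storing only the last one produced."""
    bits = [0] * n                       # the first number: a string of n 0's
    yield "".join(map(str, bits))
    while True:
        i = n - 1                        # scan starts at the rightmost bit
        while i >= 0 and bits[i] == 1:   # a 1 is changed to 0 and the scan moves left
            bits[i] = 0
            i -= 1
        if i < 0:                        # the leftmost bit was also 1: terminate
            return
        bits[i] = 1                      # the first 0 met is changed to 1
        yield "".join(map(str, bits))


if __name__ == "__main__":
    print(list(add_one_enumeration(2)))  # ['00', '01', '10', '11']
```

Only the single list `bits` persists between outputs, which mirrors the storage claim made above for the argument of f.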

In a similar way, the algorithm for example 1.2, obtained by instantiation
of the primitives that appear in the recursive definition in example
1.2, produces one permutation at a time. A permutation is produced from the
previous permutation by interchange of adjacent terms. This again is the
'good' algorithm for generating permutations.

Creating  an  Inverse 

In examples 1.1 and 1.2, the given O-functions had an inverse. In
example 1.3 the O-function as given does not have an inverse and thus the
algorithm scheme S(V') is not available. However, as will be shown, when
in a recursive definition $f \in F(V)$ the O-function does not have
an inverse, a simple transformation of f to an equivalent
definition, say f', involving an O-function having an inverse
can always be found in F(V'). Thus f' will have an equivalent in
S(V'). This new definition f' is equivalent to f in the sense that to
each argument d of f there is a 'simply' computed argument d' of f'
such that f'(d') = f(d). Using this transformation, an equivalent definition
to that of example 1.3 will be given subsequently, whose equivalent
algorithm in S(V') will produce the moves necessary to solve the Tower
of Hanoi problem one at a time, the only temporary memory necessary
being that for a record of the previous move and its number.

Memory Efficiency

In the standard compiler implementation of a recursive definition
of the form of I, that definition is taken to describe a procedure which
calls itself. The procedure uses a stack to temporarily remember, amongst
other things, the set of arguments (the data structure) associated with the
call. The size to which the stack grows varies and depends on the depth
of the calls. In general, if the definition is non-linear, i.e. has
more than 1 call of the defined function on the right, then the arguments
of the w functions will have to be stacked also. When the memory efficient
algorithm to be described here is applicable then both of these
stacks can be eliminated. Instead only 1 copy of the argument of the calling
function will be saved. All other temporary memory uses in the
algorithm are comparable to those in the standard implementation. It
will be possible to eliminate the need for these stacks for any definition
of form I, provided only, as we have said, that the w function is
associative and the O functions have a uniform inverse.

Although the 'memory efficient' algorithms of S(V') are honestly
so for the most part, the nature of the memory efficiency can be misleading.
The implementing algorithm available when w is 'associative'
and the O-function has an 'inverse' is efficient in the sense that the
memory required is usually of the order of the largest storage required
for the argument (also called a data structure) of f which arises
as f is evaluated by successive substitutions.
Typically the largest data-structure for which memory need be provided
requires a small amount of memory relative to the total of all data-structures
produced during the implementation of the definition for a given
initial data-structure, e.g. of the order of a single member of a set
when a set is being enumerated. Even when the 'inverse' does not exist
it can be incorporated as previously noted, leaving the 'memory
efficiency' notion still viable. However there is another way of obtaining
a 'memory efficient' equivalent algorithm which is deceiving.

This technique involves obtaining a technically correct equivalent
recursive definition of f, say f', having only one occurrence of f' on the
right, but in compensation involving much larger data structures X' and
a more complex function o' than the corresponding X and $o_i$ of f.* That is, for each
definition of form I there is an equivalent definition
of the form:

$$\text{II:}\quad f'(X') = \begin{cases} q'(X') & \text{if } T(X') \\ w(f'(o'(X'))) & \text{if not } T(X') \end{cases}$$

Initially, $X' \in D_{f'}$

By equivalent, we mean that there is a 1-1 correspondence
g between $D_f$ and $D_{f'}$ so that for each $d \in D_f$:

$$f(d) = f'(g(d))$$

If f' has an inverse then it can be realized in the same memory efficient
manner as other definitions in F(V'), and if not it can easily be modified
so as to have one while still keeping the result in the form
of II. Memory efficiency, however, means that the memory requirement
will not exceed the size of the largest data-structure which arises as an
argument of f' during evaluation of f'. But in this equivalent definition that
data-structure is typically much larger (often exponentially) than that
which could arise in the original definition.

*  The two classical ways this can be done are by constructing a general
breadth-first or depth-first algorithm to implement the recursive definition
of form I and then equivalently giving these as recursive definitions.

The term 'memory efficiency' as used here then requires caution
in its application.

2.  MEMORY  EFFICIENT  IMPLEMENTATIONS 

The first part of this section, through page 16, is largely devoted
to material which is probably familiar. This is done in order to develop the
definitions of a number of terms which are used later in this section. Although
the concepts are familiar the terms we use may not always be so.

In any case, the material in these preliminary pages can easily
be skipped and only referred to to pick up definitions of terms used later,
without losing the main point of the paper.


Consider the set F* of all functions f that can be defined as
follows:

Def. 2.1 (form I)

$$f(X) = \begin{cases} q(X) & \text{if } T(X) \\ w(f(o_1(X)), \ldots, f(o_{m(X)}(X))) & \text{if not } T(X) \end{cases}$$

initially $X \in D_f$.


where the primitive functions and predicates which are used in the definition
are weakly constrained as to the nature and extent of their
domains and ranges. $D_f$ is the set of initial data-structures and may be
any set. Other sets must be included in some of the domains of some of
the primitive functions. These other sets are defined recursively, using
the primitive functions. First these sets are named and their relation to
the primitive functions given, then they are defined.

m is a function whose domain must include the set $A_f$ and whose range is the
positive integers: $m(X) \ge 1$ for all $X \in A_f$.

O is a set of functions $\{o_1, o_2, \ldots\}$.

The domain of $o_i$ must include the set $\delta^i_f$. $A_f$ is the union of all the domains
of the functions $o_i$ in O and is called the domain of the O-function.

The range of $o_i$ must include the set $\rho^i_f$.

The union of the sets $\rho^i_f$ of all the functions in O is the range
of the O-function and is called $P_f$.

T is a predicate whose domain includes $D_f \cup P_f$. Its range is
{true, false}.

q is a function whose domain must include $Q_f$. Its range may be
any set, say w*.


W  is  a  function  whose  range  is  called  and  whose  domain  must  include 

*it»v 

The sets named above are defined as follows (the subscript f is
dropped where it is not essential):

$$A^1 = \{d \mid d \in D \text{ and not } T(d)\}; \text{ and for } j > 1$$

$$A^j = \{o_i(X) \mid X \in A^{j-1} \text{ and } i \le m(X) \text{ and not } T(o_i(X))\}$$

$$A = \bigcup_j A^j$$

The domain $\delta^i$ of $o_i \in O$ is:

$$\delta^i = \{X \mid X \in A \text{ and } i \le m(X)\}$$

The range of O is:

$$P = \{o_i(X) \mid X \in A \text{ and } i \le m(X)\}$$

The range $\rho^i$ of $o_i \in O$ is:

$$\rho^i = \{o_i(X) \mid X \in \delta^i\}$$

The set of terminal data-structures Q is:

$$Q = P - A$$

The  set  W  is  defined  as  follows: 

W*  ■  iw(XJ,...,Xn}  1X^  e  Wf,n  «  a  positive  integer  f/};  for  j  >  1, 
«*  •  {w(X^,...,y  iXj-c  »,t,kU)s  n  I  w 

■  -  "I.!** 

If, in addition to being a member of the set F*, a recursive
definition is terminating, as defined below, it is a member
of the set F. We need some preliminary definitions.

If $\langle i_1, \ldots, i_n\rangle = I$ is a sequence of integers then $o_{i_1 \ldots i_n}(X) = O_I(X)$
is an abbreviation for $o_{i_n}(\ldots o_{i_2}(o_{i_1}(X)) \ldots)$; for the empty
sequence $\Lambda$, $O_\Lambda(X) = X$.

A length 1 sequence of integers $\langle i_1\rangle$ is applicable to a data-structure
$X \in A_f$ if $i_1 \le m(X)$. A length n sequence of integers
$\langle i_1, \ldots, i_n\rangle$ is applicable to a data-structure X if $\langle i_1, \ldots, i_{n-1}\rangle$
is applicable to X and $\langle i_n\rangle$ is applicable to $O_{i_1 \ldots i_{n-1}}(X)$.

$\mathcal{I}(X)$ is the set of all integer sequences applicable to $X \in A_f$.

f is terminating iff $\forall d$, $d \in D$ implies $\mathcal{I}(d)$ is finite.

Note that if $\mathcal{I}(X)$ is finite it cannot contain an infinite sequence,
because it always contains all prefixes of any sequence it contains.

This completes our definition of F.* Next we give some simple
consequences of the definition which will be used later. First, the
substitutionally solvable property that for $d \in D$, $\mathcal{I}(d)$ is finite can be
extended to any $X \in A_f$. This is done in lemmas 2.1 and 2.2.


Simple Properties of $f \in F$

Lemma 2.1:  If $f \in F$ and $X \in A_f$ then $\exists$ an integer sequence $I \in \mathcal{I}(d)$
and $\exists$ a data-structure $d \in D$ such that $O_I(d) = X$.

Proof:  If $X \in A_f$ then obviously there exists some j (at least 1)
such that $X \in A^j$. The lemma is proven by induction on the sets $A^j$.
Assume there is a length k-1 sequence $I_Y$ for each data-structure
$Y \in A^k$ and $d \in D$ such that $O_{I_Y}(d) = Y$. Then it follows,
by definition of $A^{k+1}$, that if $X \in A^{k+1}$ then $X = o_i(Y)$ for some $i \le m(Y)$
and $Y \in A^k$. Thus $X = o_i(O_{I_Y}(d)) = O_{I_Y \frown \langle i\rangle}(d)$. Since
also $D \supseteq A^1$, and $O_\Lambda(d) = d$ for each $d \in D$, the proof is complete.

Lemma 2.2:  If $f \in F$ and $X \in A_f$, then $\mathcal{I}(X)$ is finite.

Proof:  From the previous lemma the data-structure $X = O_I(d)$ for
some $d \in D$ and integer sequence I. Therefore $\mathcal{I}(d) \supseteq$ the
set consisting of I concatenated with each member of $\mathcal{I}(X)$.
Thus if $\mathcal{I}(X)$ is not finite, $\mathcal{I}(d)$ cannot be finite, but this
contradicts the condition that $f \in F$ is substitutionally
solvable.

*  F as defined here is the same as F(V) as defined in the Introduction.

Another consequence of the definition of F is that the data-structures
in $A_f$ can be usefully ordered in another, almost reverse,
manner than the ordering by membership in the subsets $A^j$. In most of
the subsequent inductive proofs, induction will be carried out on this
ordering.

Ordering the Data-Structures in A (Remoteness):

For any function f in F:

We say a data-structure X in $A_f \cup Q_f$ is of remoteness 0 (or is
terminal) if $X \in Q_f$.

We say a data-structure X in $A_f \cup Q_f$ is of remoteness n if:

(1)  $\exists i$: $i \le m(X)$ and $o_i(X)$ is of remoteness n-1, and

(2)  $\forall i$: $i \le m(X)$ implies $o_i(X)$ is of remoteness n-k and $k \ge 1$.⁶

Lemma 2.3:  If $f \in F$, then there is a function r with domain $A_f \cup Q_f$
such that if $X \in A_f \cup Q_f$ then r(X) = the remoteness of X.

Proof:  For each $X \in A_f \cup Q_f$ let r(X) be the maximum of the lengths
of all the sequences in $\mathcal{I}(X)$. For each $X \in A_f \cup Q_f$, X is of
remoteness r(X). This is shown by induction. If T(X) then
$\mathcal{I}(X)$ is empty and r(X) = 0. Assume that if r(X) < n, X is
of remoteness r(X). Let r(X) = n, i.e. there is a longest
sequence of length n, say $I = \langle i_1, \ldots, i_n\rangle$ in $\mathcal{I}(X)$. Let
$o_{i_1}(X) = Y$. Then $I' = \langle i_2, \ldots, i_n\rangle$ is in $\mathcal{I}(Y)$. Furthermore,
no sequence applicable to Y is longer than I' because otherwise
I could not have been a longest sequence in $\mathcal{I}(X)$. So
r(Y) = n-1 and Y is of remoteness r(Y) = n-1. Therefore,
since $o_{i_1}(X) = Y$ and for all $j \ne i_1$, $j \le m(X)$, $r(o_j(X)) \le n-1$,
X is of remoteness r(X) = n by definition of remoteness.

6  Alternately this can be phrased 'of remoteness < n'.
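The function r of Lemma 2.3 can be sketched directly from the definition of remoteness: 0 for a terminal data-structure, otherwise one more than the largest remoteness among its successors. The code below is an illustrative sketch, instantiated with the primitives of Ex. 1.3; the names `remoteness`, `T_hanoi`, `m_hanoi` and `o_hanoi` are assumptions introduced here.

```python
def remoteness(x, T, m, o):
    """r(X): length of the longest applicable integer sequence starting at X."""
    if T(x):
        return 0                      # terminal: no sequence is applicable
    # Some successor has remoteness r-1 and none has more.
    return 1 + max(remoteness(o(i, x), T, m, o) for i in range(1, m(x) + 1))


# Primitives of Ex. 1.3, with X = ((x, y, z), n).
def T_hanoi(x):
    return x[1] == 1

def m_hanoi(x):
    return 3

def o_hanoi(i, x):
    (a, b, c), n = x
    if i == 1:
        return ((a, c, b), n - 1)
    if i == 2:
        return ((a, b, c), 1)
    return ((c, b, a), n - 1)


if __name__ == "__main__":
    print(remoteness(((1, 2, 3), 4), T_hanoi, m_hanoi, o_hanoi))  # 3
```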


Properties of $f \in F$ Sufficient for Memory Efficient Implementations

An efficient implementation becomes
available when the recursive definition $f \in F$ has some special properties.
These properties are now defined.

Associativity:  Associativity has the usual meaning here. The function
w is associative if:

$$w(a_1, a_2, \ldots, a_m) = w(w(a_1, a_2), a_3, \ldots, a_m) \quad \text{for } m \ge 3$$


w = minimum, sum, catenation and union provide examples of w-functions
with this property. In each case one can compute $w(a_1, \ldots, a_m)$ as follows

    X ← a_1
    For i = 2 to m
        X ← w(X, a_i)

thus requiring at any one time memory for at most 2 copies of the result
of $w(a_1, \ldots, a_j)$, $j \le m$. If w is the function minimum, this memory does not
increase on the number, but only on the value of its arguments $a_i$. If w is catenation,
sum, or union the memory required will increase, albeit at different
rates, with the number of arguments. There is, however, a significant
difference in use of the memory between a computation of catenation and of
union. To obtain catenate(a,b), b needs only be attached at the end of
a. To obtain union(a,b), a must be searched for an occurrence of a
member of b. If a represents the result of a previous computation then in
the union case it is necessary to re-access this memory, whereas this is not
necessary in the catenation case. This is an important consideration because
memory that is not re-accessed can be located in areas of memory
(disc) which need not be easy to access (as is core). The temporary
memory requirements for the implementation of a function then do not depend
on the usual mathematical properties of that function only, but also depend
on the means available for accessing the memory.⁷ Nevertheless, for
compactness our results are given in terms of the usual mathematical
properties, so caution is needed in their interpretation.
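The accumulation loop above can be sketched in Python as follows. It assumes w is available as a two-argument function and that the arguments arrive one at a time from an iterable; `fold_w` is a name chosen for this sketch, not one used in the report.

```python
def fold_w(w, args):
    """Compute w(a_1, ..., a_m) for an associative w with a single accumulator."""
    it = iter(args)
    acc = next(it)            # X <- a_1
    for a in it:              # For i = 2 to m
        acc = w(acc, a)       #     X <- w(X, a_i)
    return acc


if __name__ == "__main__":
    print(fold_w(min, [3, 1, 2]))                        # 1
    print(fold_w(lambda a, b: a + b, ["ab", "c", "d"]))  # abcd
    print(fold_w(lambda a, b: a | b, [{1}, {2}, {1}]))   # {1, 2}
```

Because only the running value and the next argument are live at once, the arguments can be produced lazily, for instance by a generator that enumerates terminal data-structures.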


Consider a set of functions $H = \{h_1, \ldots, h_M\}$. Let $D_i$ be the domain
over which $h_i$ is defined and let $R_i$ be the corresponding range of $h_i$.
Then we will say $D = \bigcup_i D_i$ is the domain of H and $R = \bigcup_i R_i$ is its range.
The set of functions H is said to have a uniform inverse on the
domain D if:

(1)  Every $h_i \in H$ has an inverse, and

(2)  $R_i \cap R_j = \emptyset$ for every $R_i \ne R_j$ in R.

If H has a uniform inverse then it is easy to see that the
following two 'uniform inverse' functions on R exist for $r \in R_i \subseteq R$:

(1)  $H^{-1}(r) = d \in D$ such that $h_i(d) = r$

(2)  $i_H(r) = i$, the index of the range of which r is a member.

A recursive definition $f \in F$ has a uniform inverse if the set of
functions $o_i \in O$ in f has a uniform inverse.

7  It is also true that there may be some advantage in time efficiency in
one grouping of the arguments of w over another though both give the same
result when w is associative. An example of such a function is merge, i.e.
$\text{merge}(a_1, \ldots, a_m)$ in which the $a_i$ are each finite sorted sets of numbers.

For a given function set O it is possible that none, one, or two
of the pair $O^{-1}$, $i_O$ exist. Despite the fact that the uniform inverse
is a strong condition it does often occur. Furthermore when it
doesn't, there is always a strongly equivalent definition which does
have a uniform inverse. This is shown after a short digression required
to define strong equivalence.


Equivalence of Recursive Definitions:
Consider two definitions in F:

(1)  f on domain D

$$f(X) = \begin{cases} q(X) & \text{if } T(X) \\ w(f(o_1(X)), \ldots, f(o_{m(X)}(X))) & \text{if not } T(X) \end{cases}$$

initially $X = d \in D$

(2)  g on domain D'

$$g(X') = \begin{cases} q'(X') & \text{if } T'(X') \\ w'(g(o'_1(X')), \ldots, g(o'_{m'(X')}(X'))) & \text{if not } T'(X') \end{cases}$$

initially $X' = d' \in D'$

If there is a 1-1 correspondence between D and D' such that whenever
$d \in D$ and $d' \in D'$ are two corresponding data-structures f(d) = g(d'),
then the two definitions are equivalent. The above correspondence may be
extended to one between $A_f$ and $A_g$, with $\delta \in A_f$ corresponding to $\delta' \in A_g$,
by having $o_i(\delta)$ correspond to $o'_i(\delta')$ whenever $\delta$ corresponds to $\delta'$ and
$o_i(\delta)$ and $o'_i(\delta')$ are both defined. This is called a structural correspondence.
If in addition to such a structural correspondence of $A_f$ to
$A_g$ the following conditions hold

(1)  $T(\delta) = T'(\delta')$

(2)  $q(\delta) = q'(\delta')$ if $T(\delta)$ (and $T'(\delta')$)

(3)  $m(\delta) = m'(\delta')$ if not $T(\delta)$ (and not $T'(\delta')$)

(4)  $w = w'$

then f and g are strongly equivalent.


Strong equivalence of two definitions implies that they not only give
the same results but also require the same number of substitutions in
their evaluation for corresponding initial arguments.

As an example of a strong equivalence, consider the two functions f and
g, each in F:

(1)
$$f(X) = \begin{cases} q(X) & \text{if } T(X) \\ w(f(o_1(X)), \ldots, f(o_{m(X)}(X))) & \text{if not } T(X) \end{cases}$$

initially $X = d \in D$

(2)
a)  $g(X,Y) = q(X)$  if T(X)

b)  $g(X,Y) = w(g(o_1(X), h_1(Y)), \ldots, g(o_{m(X)}(X), h_{m(X)}(Y)))$  if not T(X)

initially $\langle X,Y\rangle = \langle d, y_d\rangle \in D'$ with $d \in D$ and $y_d \in Y$

($H = \{h_1, \ldots, h_M\}$ is a set of primitive functions)


Let data-structure $d \in D$ correspond to $\langle d, y_d\rangle \in D'$. Extend this
correspondence to one between $A_f$ and $A_g$ by letting $o_i(X) \in A_f$ correspond
to $\langle o_i(X), h_i(Y)\rangle \in A_g$ whenever $X \in A_f$ corresponds to $\langle X,Y\rangle \in A_g$ and not T(X)
and $i \le m(X)$. For example if $d \in D$ and $o_i(d)$ is defined then it corresponds
to $\langle o_i(d), h_i(y_d)\rangle \in A_g$.

For each member of $A_f$ this correspondence defines a corresponding
member of $A_g$. This follows because every member in $A_f$ is either in D,
for which the correspondence is given explicitly, or it = $o_i(X)$ for
$X \in A_f$ and $o_i$ is defined and not T(X), in which case the correspondence to
a member of $A_g$ is given since $o'_i(X,Y)$'s existence just depends on X, because
m'(X,Y) = m(X), T'(X,Y) = T(X).

Conditions (1) through (4) are obviously satisfied for this correspondence
in the above definitions. Furthermore, the function g(X,Y) is
independent of Y, its second argument. This is shown inductively as
follows. Directly from the definition (2a) we see that g(X,Y) is independent
of Y when (X,Y) is of remoteness 0. Its being of remoteness 0
is also independent of Y. Referring to (2b), if it is assumed that each
term $g(o_i(X), h_i(Y))$ appearing on the right is independent of its second
argument then it follows certainly that g(X,Y) on the left of (2b) is
independent of Y. If the argument on the left side of (2b) is of remoteness
n from terminal then all the arguments of terms on the right are of
remoteness < n from terminal. Thus the inductive argument is completed,
concluding that g(X,Y) is independent of Y if X, and thus if (X,Y), is of
remoteness 0, 1, 2, ..., n.

Thus  definition  (2)  can  be  rewritten  removing  Y  which  with  f  replacing 
g  is  the  same  as  (1).  Therefore: 

Lemma  2.5  g  and  f  above  are  strongly  equivalent 

Since  the  value  of  g(X,Y)  is  independent  of  Y  it  may  seem  silly  to 
ever  construct  such  a  definition,  with  a  'redundant'  Y,  to  replace  f,  or 
alternatively  that  such  a  redundant  Y  would  arise  inadvertently  in  g  to 
be  removed  by  replacement  with  the  equivalent  f.  The  following  theorem, 
however,  demonstrates  that  such  'redundant'  additions  can  be  of  consider¬ 
able  use. 


Proof:  If f already has a uniform inverse it serves as its own
strongly equivalent definition. If not, the following definition
serves that purpose. Referring to Def. 2.1 for the
definition of f, the following function g, defined in terms
of the same sets, primitives and predicates, is strongly
equivalent to f. ($p = \langle p_1, \ldots, p_n\rangle$ is a vector which
records indices, and d is the initial data structure.)

$$g(X,p,d) = \begin{cases} q(X) & \text{if } T(X) \\ w(g(o_1(X), \langle 1\rangle /\!/ p, d), \ldots, g(o_{m(X)}(X), \langle m(X)\rangle /\!/ p, d)) & \text{if not } T(X) \end{cases}$$

initially $\langle X,p,d\rangle = \langle d, \Lambda, d\rangle$ with $d \in D$.

g is strongly equivalent to f by application of lemma 2.5,
with Y of that lemma corresponding to $\{\langle p,d\rangle \mid p$ a sequence of integers,
$d \in D\}$ and $y_d$ corresponding to $\langle \Lambda, d\rangle$ with $d \in D$. Furthermore g has a uniform
inverse which is given by the following:


$$i_O(X,p,d) = p_1$$

$$O^{-1}(X,p,d) = \langle o_{p_2}(\ldots(o_{p_{n-1}}(o_{p_n}(d)))\ldots),\ p[p_1 \leftarrow \Lambda],\ d\rangle$$
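The construction in this proof can be illustrated with a short sketch: the index vector p is extended at the front on every descent, and the parent is recovered by replaying the remaining indices from the initial data structure d. The example reuses the Ex. 1.3 primitives, whose $o_2$ has no inverse. The names `o`, `child`, `sibling_index` and `parent` are assumptions of this sketch.

```python
def o(i, x):
    """Primitives o_1, o_2, o_3 of Ex. 1.3 acting on x = ((x, y, z), n)."""
    (a, b, c), n = x
    if i == 1:
        return ((a, c, b), n - 1)
    if i == 2:
        return ((a, b, c), 1)          # n is lost here, so o_2 has no inverse
    return ((c, b, a), n - 1)

def child(i, state):
    """The i-th successor of <X, p, d>: apply o_i and record i at the front of p."""
    x, p, d = state
    return (o(i, x), (i,) + p, d)

def sibling_index(state):
    """i_O(X, p, d) = p_1."""
    return state[1][0]

def parent(state):
    """O^{-1}(X, p, d): replay p_n, ..., p_2 from d and drop p_1."""
    x, p, d = state
    y = d
    for i in reversed(p[1:]):
        y = o(i, y)
    return (y, p[1:], d)


if __name__ == "__main__":
    d = ((1, 2, 3), 3)
    root = (d, (), d)
    node = child(2, child(1, root))
    print(sibling_index(node), parent(node) == child(1, root))  # 2 True
```

The replay loop makes visible why this inverse is costly: each call to `parent` recomputes a whole chain of data-structures from d.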


The $O^{-1}$ function is quite complex, requiring recreating a sequence of
data-structures starting with the initial data structure. In practice,
one wants to construct a strongly equivalent definition which gives an
inverse but entails the creation of an $O^{-1}$ function which is simple;
simpler, hopefully, than that given in the above theorem. This can often be
done, for example as follows.

Corollary 2.1:  If for a given recursive definition $f \in F$ there is no uniform
inverse, but each function $o_i \in O$ has an inverse $o_i^{-1}$, then the
definition for g given above with the third component d deleted from its
arguments will serve, with the additional benefit that an alternative simpler
definition of $O^{-1}(X,p) = \langle o_{p_1}^{-1}(X),\ p[p_1 \leftarrow \Lambda]\rangle$ can be used.

This corollary can be applied to the 'Tower of Hanoi' definition,
Ex. 1.3. In that example, $o_i \in O$ has an inverse for i = 1 and 3 but does
not quite have an inverse when i = 2:

$$o_1^{-1}(\langle\langle x,y,z\rangle,n\rangle) = \langle\langle x,z,y\rangle,n+1\rangle$$

$$o_2^{-1}(\langle\langle x,y,z\rangle,n\rangle) = \langle\langle x,y,z\rangle,A\rangle \text{ where } A \text{ cannot be determined from } \langle x,y,z\rangle, n$$

$$o_3^{-1}(\langle\langle x,y,z\rangle,n\rangle) = \langle\langle z,y,x\rangle,n+1\rangle$$

So first we slightly modify the definition of f so there will be an
inverse for $o_2$. Lemma 2.5 justifies this simple modification, in which a
component s is added to store the quantity A above when i = 2, and
otherwise to remain equal to 0.


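$$f'(\langle\langle x,y,z\rangle,n\rangle,s) = \begin{cases} \langle x,y\rangle & \text{if } n = 1 \\ f'(\langle\langle x,z,y\rangle,n-1\rangle,s) \frown f'(\langle\langle x,y,z\rangle,1\rangle,n) \frown f'(\langle\langle z,y,x\rangle,n-1\rangle,s) & \text{if } n > 1 \end{cases}$$

initially $(\langle\langle x,y,z\rangle,n\rangle,s) = (\langle\langle 1,2,3\rangle,n\rangle,0)$, $n \in N$

Now f' is equivalent to f in 1.3 and $o_i$ has an inverse for i = 1, 2, or 3.
These inverses are:

$$o_1^{-1}(\langle\langle x,y,z\rangle,n\rangle,s) = \langle\langle\langle x,z,y\rangle,n+1\rangle,0\rangle$$

$$o_2^{-1}(\langle\langle x,y,z\rangle,n\rangle,s) = \langle\langle\langle x,y,z\rangle,s\rangle,0\rangle$$

$$o_3^{-1}(\langle\langle x,y,z\rangle,n\rangle,s) = \langle\langle\langle z,y,x\rangle,n+1\rangle,0\rangle$$

Corollary 2.1 now applies to f'. Its application yields g below.
(Some unnecessary >'s and <'s have been dropped.)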

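$$g(\langle x,y,z\rangle,n,s,p) = \begin{cases} \langle x,y\rangle & \text{if } n = 1 \\ g(\langle x,z,y\rangle,n-1,s,\langle 1\rangle \frown p) \frown g(\langle x,y,z\rangle,1,n,\langle 2\rangle \frown p) \frown g(\langle z,y,x\rangle,n-1,s,\langle 3\rangle \frown p) & \text{if } n > 1 \end{cases}$$

initially $\langle\langle x,y,z\rangle,n,s,p\rangle = \langle\langle 1,2,3\rangle,n,0,\Lambda\rangle$

and the uniform inverse is given by

$$i_O(\langle x,y,z\rangle,n,s,p) = p_1$$

$$O^{-1}(\langle x,y,z\rangle,n,s,p) = \langle o_{p_1}^{-1}(\langle x,y,z\rangle,n,s),\ p[p_1 \leftarrow \Lambda]\rangle$$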
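To make the role of the uniform inverse concrete, here is a hedged sketch that walks the terminal data-structures of g one at a time using only the current state, the successor functions and their inverses. It is not the report's flowchart (figure 2.2 is not reproduced in this section); the traversal order, the function names `o`, `o_inv` and `moves`, and the omission of an explicit w-accumulator (moves are simply yielded in order, which corresponds to catenation) are all assumptions of this illustration.

```python
def o(i, st):
    """Successor o_i of the state (<x,y,z>, n, s, p) in the definition of g."""
    (x, y, z), n, s, p = st
    if i == 1:
        return ((x, z, y), n - 1, s, (1,) + p)
    if i == 2:
        return ((x, y, z), 1, n, (2,) + p)    # s stores the n that o_2 discards
    return ((z, y, x), n - 1, s, (3,) + p)

def o_inv(st):
    """Uniform inverse: p_1 selects which o_i^{-1} to apply, then p_1 is dropped."""
    (x, y, z), n, s, p = st
    i = p[0]
    if i == 1:
        return ((x, z, y), n + 1, 0, p[1:])
    if i == 2:
        return ((x, y, z), s, 0, p[1:])
    return ((z, y, x), n + 1, 0, p[1:])

def moves(n):
    """Yield the Tower of Hanoi moves one at a time, with no recursion stack."""
    st = ((1, 2, 3), n, 0, ())
    while True:
        while st[1] != 1:                 # descend along first successors
            st = o(1, st)
        yield st[0][:2]                   # terminal: q gives the move <x,y>
        while st[3] and st[3][0] == 3:    # climb while we are a last sibling
            st = o_inv(st)
        if not st[3]:                     # back at the initial data-structure
            return
        i = st[3][0]
        st = o(i + 1, o_inv(st))          # step to the next sibling


if __name__ == "__main__":
    print(list(moves(2)))  # [(1, 3), (1, 2), (3, 2)]
```

Note that the index vector p still grows with the depth of the definition, so this sketch trades the stack of full data-structures for a vector of small integers rather than eliminating depth-dependent storage altogether.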




Implementation  of  f  e  F  with  Associativity  and  Uniform  Inverse 

We  will  give  a  way  of  implementing  any  f  in  F  which  has  a  uniform 
inverse  and  in  which  w  is  associative.  The  implementation  is  described  by 
a  flowchart  containing,  as  •usual  interconnected  assignments  and  decision 
statements.  The  expressions  in  the  assignment  statements  and  decisions 
are  compositions  involving  the  primitive  functions  and  predicates  w,  o^  e  0, 
m,  q  and  T  and  the  inverses  0’*,  ig  which  enter  the  definition  of  f  e  F. 

In  addition  to  the  above  functions  from  the  definition  of  f,  the  repetoire 
of  flowchart  expression  is  completed  by  an  add  1  function,  a  push  and  pop 
and  an  *  predicate.  There  is  a  storage  cell  X  which  is  assumed  adequate  to 
hold  any  member  in  A^u  u  Df.  Although  there  is.  such  a  push  and  pop, 
the  list  on  which  they  operate,  V,  can  hold  at  most  only  2  members  in  u  w^. 

The flowchart which follows describes a computation for each d ∈ D.

It is necessary to give a concrete interpretation of the sense in which a
flowchart describes a computation. We imagine a traveler who starts by
entering block (0) of the flowchart. The traveler carries out the
computation described in that block then, depending on the nature of the
block, proceeds to the appropriate next block. The traveler continues
following the block instructions and proceeding through the flowchart until
FIN! is reached, completing the voyage. The value found in V when the
traveler has completed the voyage is the value computed by the flowchart.


Flowchart: notation and assumptions

In the flowchart we will use the following notation. General:

(e is an expression)

    X ← e             the value of e is assigned to X

    V ←push— e        the value of e is pushed into list V

    X ←pop— V         the top member of V is popped and assigned to X

    X ←pop— V[n]      the top n members of V are popped and assigned to X

If V is a list = <v_1, v_2, ..., v_n>, then w(V) stands for the expression
w(v_1, v_2, ..., v_n).


Primitives and their Compositions: (Some of the definitions are extended
to make the flowcharts work if the initial data-structure is terminal.)

    Flowchart Notation    Meaning

    FIRST.KID(X)          ≡ o_1(X) if X ∈ A_f
    #KIDS(X)              ≡ m(X) if X ∈ A_f; ≡ 1 if X ∈ Q_f
    X = TERMINAL?         ≡ T(X) if X ∈ A_f ∪ Q_f
    PARENT(X)             ≡ o^-1(X) if X ∈ A_f; ≡ X if X ∈ Q_f
    SIB#(X)               ≡ i_0(X) if X ∈ A_f; ≡ 1 if X ∈ Q_f
    NEXT.SIB(X)           ≡ o_(SIB#(X)+1)(PARENT(X)) if X ∈ A_f
    #SIBS(X)              ≡ #KIDS(PARENT(X)) if X ∈ A_f; ≡ 1 if X ∈ Q_f

If w is associative we assume that there is a member 0_w in the range of
w such that w(X, 0_w) = X for all X in the range of w.




When we say a flowchart implements or realizes an f ∈ F we mean
that for each d ∈ D the value of the function f(d) is equal to the value
computed by the flowchart with the traveler starting at block 0 and d in
the flowchart equal to d in f(d).

We now present a proof that the given flowchart
does implement f ∈ F under the appropriate constraints. The proof
uses induction on the remoteness of the data-structures.


Theorem 2.2: If f ∈ F and f has a uniform inverse and w in the definition
of f is associative then f is implemented by the flowchart of
figure 2.1.

Proof: First we need to show that if, for A of remoteness n, block (1) of the
flowchart of figure 2.1 is entered with A in X and B in V, then
eventually the traveler arrives at (2) with A still in
X and with V containing

    w[... w[w[B, f(o_1(A))], f(o_2(A))], ..., f(o_m(A)(A))] =

    (by associativity)

    w[B, w[f(o_1(A)), ..., f(o_m(A)(A))]] = w[B, f(A)]

We use induction. The case when A is of remoteness 1
is easily verified by tracing the flowchart through the
.  sequence  of  blocks  <00  ©  0  (6)  (4)  0>m(A)-l  times  and 
then  through  0000000(f). 


Assume the First result is correct if the remoteness of A is < n. Now
let A be of remoteness n; X is A, V is B and the traveler is
at (1). The traveler goes to (5) where X becomes FIRST.KID(X) =
o_1(A) and the traveler returns to (1). Since o_1(A) is of
remoteness < n, the inductive hypothesis applies. Thus the
traveler arrives at (2) with X being o_1(A) and V = w[B, f(o_1(A))].
Oj(A)  cannot  be  initial  because  of  the  inverse  so  the  traveler 
goes  next  to  if  we  assume  now  that 

1 = SIB#(X) ≠ #SIBS(X) where X = o_1(A), the traveler will pass
through (4) and (5) updating X to contain o_2(A) and then enter
(1). By inductive hypothesis again the traveler will eventually
arrive at (2) with V containing:

    w[w[B, f(o_1(A))], f(o_2(A))]

and X containing o_2(A). Assuming without loss in generality
that p = m(A), the traveler will eventually arrive at (9)

after  p  repeats  of  the  journey  from  to  (9)  with  V  contain- 

ing: 

    w[... w[w[B, f(o_1(A))], f(o_2(A))], ..., f(o_p(A))] =

    w[B, w[f(o_1(A)), ..., f(o_p(A))]]

by associativity and = w[B, f(A)] by definition of f(A).

X contains o_p(A) at this time. So the traveler goes to (4)
where the decision is yes; (7) is next with X becoming
its PARENT, i.e. PARENT(o_p(A)) = A. Thus the traveler
arrives at (2) again with X containing A, V still containing
w[B, f(A)]. Thus the First result is proven. Now let the


traveler start by entering (0) thus setting X to d and V
to B = 0_w. Next the traveler enters (1) with these values
in X and V and so by the First result the traveler will
eventually arrive at (2) with X containing d and V containing
w[0_w, f(d)] = f(d) by definition of 0_w.

As before the proof is for d ∈ D having remoteness ≥ 1 and
is verified to include remoteness 0 by tracing the flowchart
explicitly for this case.

The necessity for a 'uniform inverse' as opposed to a simple inverse
in developing this theorem results from the fact that in the standard
form of recursive definition considered here the number of appearances of
the defined function symbol f is determined (= m(X)) by X, the argument of f.
This dependence was incorporated so that many common problems could be
naturally expressed in that form.

We have not discussed the higher order recursive definitions having
nesting on the right - largely because in our experience such definitions
rarely occurred in practice. Such definitions are considered in [6]. The
techniques  given  in  C<3  in  combination  with  those  here  can  be  used  to  extend 
the above results to higher order recursive definitions not covered in [6].


Application  of  Theorem  2.2 


Before applying theorem 2.2, it will be useful to make some notes on
the flowchart of figure 2.1.

Flowchart Notes:

1) In general, X has a number of components; these components will
   be saved in identified storage locations.

   In any assignment to X in the flowchart, only those locations
   holding components which are changed need be assigned.

   In any decision on X in the flowchart, only those component
   locations need be tested which are necessary to secure the
   decision.

2) Some of the functions in the boxes, like NEXT.SIB(X) in (5),
   and the decision in (4) are sufficiently complex compositions
   so that simplifications are often possible.

3) The assignment in box (7) is made when data-structure X has the
   property that SIB#(X) = #SIBS(X).

4) Conversely, the assignment in box (5) is made when SIB#(X) ≠ #SIBS(X).

Beyond these generally applicable simplifications, others are applicable
when the primitive functions in the flowchart have special properties.

The following is important for our example.

5) If w is a union and each q(X) produced in (2) will be different from
   all others produced there, then boxes © and © may be replaced by one
   having the assignment OUTPUT ← q(X), meaning place q(X) in the next
   location in an internal storage table or external (paper) table.


Theorem 2.2 applies to many enumeration problems. It produces a
'good' algorithm from an easily justified recursive definition. For
example, the theorem applies to Ex 1.1, a recursive definition for enumerating
binary numbers. The definition in Ex 1.1 does have an inverse - namely:

    if a = <a_1, ..., a_p>

    o^-1(a,n) = <a[a_p ← λ], n+1>

    i_0(a,n) = a_p + 1

Using this inverse and the o, w, m, T of ex 1.1, we get the following
flowchart expression definitions:

    X = <a,n>

    FIRST.KID(<a,n>) ≡ <a//<0>, n-1>

    #KIDS(<a,n>) ≡ 2

    <a,n> = TERMINAL? ≡ n = 0?

    PARENT(<a,n>) ≡ <a[a_p ← λ], n+1>

    SIB#(<a,n>) ≡ a_p + 1

    #SIBS(<a,n>) ≡ 2

    SIB#(<a,n>) = #SIBS(<a,n>)? ≡ a_p + 1 = 2?
        referring to note 2 we can simplify in this case: a_p = 1?

    NEXT.SIB(<a,n>) ≡ <a[a_p ← λ]//<1>, n> if a_p = 0, which is all it
        can equal, so simplifying:
                    ≡ <a[a_p ← 1], n>
        also observe, referring to note 1, that n is not changed.

The flowchart of figure 2 is obtained by inserting these definitions in that of
figure 1 as modified according to note 5 which is valid in this case.

It is essentially the 'add-one' algorithm described in the introduction.
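The way these definitions drive an enumeration without a stack can be sketched in a few lines. The Python below is an illustrative reading, not a transcription of figure 2: the function name `binary_numbers` and the use of a Python list for a are assumptions, and the child of <a,n> is taken to carry n-1 so that it agrees with PARENT and with the n = 0 terminal test. The traversal keeps a single vector a and uses PARENT, SIB# and NEXT.SIB to move through the tree.

```python
def binary_numbers(N):
    """Enumerate all binary vectors of length N, keeping one vector a.

    Hypothetical sketch: X = <a, n>, root is <empty vector, N>.
    """
    a, n = [], N
    out = []
    while True:
        while n > 0:                     # FIRST.KID: <a//<0>, n-1>
            a.append(0)
            n -= 1
        out.append(tuple(a))             # OUTPUT <- q(X) at a terminal X
        while n < N and a[-1] == 1:      # SIB# = #SIBS, i.e. a_p = 1
            a.pop()                      # PARENT: <a[a_p <- empty], n+1>
            n += 1
        if n == N:                       # back at the initial data-structure
            break
        a[-1] = 1                        # NEXT.SIB: <a[a_p <- 1], n>
    return out


table = binary_numbers(2)
```

For N = 2 the sketch produces 00, 01, 10, 11 in that order, which is the counting order one would expect from repeatedly adding one.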


figure  2 


Similarly, theorem 2.2 can be applied to ex 2.1, the permutation
definition. This definition also has an inverse, namely:

    if a = <a_1, ..., a_p>, and x = the position (index) of the integer
    (n+1) within a

    o^-1(n,a) = <n+1, a[a_x ← λ]>

    i_0(n,a) = x

Most of the flowchart definitions result from straightforward substitution
of the inverse and the given primitive functions. For example:

    SIB#(<n,a>) ≡ x

    #SIBS(<n,a>) ≡ #KIDS(PARENT(<n,a>)) = m(<n+1, a[a_x ← λ]>)

From which the decision of box 4:

    (SIB#(<n,a>) = #SIBS(<n,a>)?) ≡ (x = |a|?)

which by the interpretation of x is equivalent to:

    ≡ (a_p = (n+1)?)

Also note that:

    NEXT.SIB(<n,a>) ≡ o_(x+1)(<n+1, a[a_x ← λ]>)

                    ≡ <n, (a[a_x ← λ])[a_(x+1) ← a_(x+1), n+1]>

which again, by the interpretation of x is:

                    ≡ <n, a[a_x, a_(x+1) ← a_(x+1), a_x]>

representing an interchange of the (n+1) component in a with its right
neighbor.

These definitions when put in figure 1 as follows produce an
algorithm which stores only one permutation. The next permutation is
always produced by interchanging components of the currently stored one.
The resultant flowchart is:


figure  3 
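The behaviour described for this flowchart can be sketched as follows. This Python is a hedged illustration rather than a copy of figure 3: the name `permutations_by_interchange` is hypothetical, and it assumes that the first kid of <n,a> places n at the front of a, which is one reading consistent with the inverse and with NEXT.SIB given above. Only one vector a is stored, and each sibling step is an interchange of (n+1) with its right neighbor.

```python
def permutations_by_interchange(N):
    """Enumerate permutations of 1..N while storing a single vector a.

    Hypothetical sketch: X = <n, a>, root is <N, empty vector>.
    """
    a, n = [], N
    out = []
    while True:
        while n > 0:                       # FIRST.KID (assumed): n goes in front
            a.insert(0, n)
            n -= 1
        out.append(tuple(a))               # OUTPUT <- q(X) at a terminal X
        while n < N and a[-1] == n + 1:    # decision: a_p = (n+1)?
            a.pop()                        # PARENT: <n+1, a[a_x <- empty]>
            n += 1
        if n == N:                         # back at the initial data-structure
            break
        x = a.index(n + 1)                 # the search for the position of (n+1)
        a[x], a[x + 1] = a[x + 1], a[x]    # NEXT.SIB: interchange with right neighbor
    return out


perms = permutations_by_interchange(3)
```

The call to `a.index` is the search for (n+1) that the text following the figure sets out to remove.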


Note that this flowchart contains the assignment of the 'position of (n+1)
in a' to the variable x, implying a search for the integer (n+1) in a. A
modification in the argument of the recursive definition can be made which
will obviate this search. If we add to the argument a vector β giving the
position of (n+1) in a, the 'position of (n+1)' will be available. Thus
modifying Ex 1.2:

    D_f = {<n,a,β> | n ∈ N, a and β strings of positive integers}

    f(<n,a,β>) = {a}    if n = 0

    f(<n,a,β>) = f(<n-1, a[a_0 ← n], <1>//β>) ∪ ... ∪
                 f(<n-1, a[a_(p+1) ← n], <p+1>//β>)    if n > 0

    X is initially ∈ {<n,λ,λ> | n ∈ N}


The modification still has an inverse and can be implemented by the
inclusion of a variable β and following the prescription of theorem 2.2 to
obtain the following assignments to that variable in figure 2.1.

    β ← λ in ®

    β ← <1>//β in ®

    β ← β[β_1 ← λ] in ®

    replace x by β_1 in ® and after the assignment to a put:

    β ← β[β_1 ← β_1 + 1]

This illustrates how with this approach one can modify an algorithm
focusing on the effect of a change in data-structure without ever having to
be enmeshed in the control structure of the algorithm.


The  Search  for  the  Inverse 

We have been building a system which examines a recursive definition
for properties identified in this paper as sufficient to allow efficient
implementation. An important part of this system is a program for determining
whether the given definition has a uniform inverse. For a recursive
definition of the standard form:

Let the data-structure X be a vector with n components, i.e.

    let X = <x_1, ..., x_n>

Let the primitive O-functions be:

    o_I(X) = o(I,X) = o(I, x_1, ..., x_n) = <y_1, ..., y_n>

Let o[j](I,X) be the j-th component of o(I,X), thus:

    set A:  y_1 = o[1](I, x_1, ..., x_n)
            y_2 = o[2](I, x_1, ..., x_n)
            ...
            y_n = o[n](I, x_1, ..., x_n)

It is easy to see that the recursive definition has a uniform inverse
if there is a unique solution of set A for I and each x_j as functions
of y_1 through y_n. If the solution for I is r(y_1, ..., y_n) and for
x_j is t_j(y_1, ..., y_n), then the i_0 and o^-1 functions in the uniform
inverse are given by:

    i_0(y_1, ..., y_n) = r(y_1, ..., y_n)

    o^-1(y_1, ..., y_n) = <t_1(y_1, ..., y_n), t_2(y_1, ..., y_n), ...,
                           t_n(y_1, ..., y_n)>

Currently we have implemented a search for a 'simple' uniform inverse.
This is described below.

The first step is to set up the equations of set A for a given recursive
definition and then to try to obtain a solution by the following simple
procedure.

1. First each equation in the set is tested to determine whether that
equation by itself can be inverted. Often the right hand side of the
equation will be only dependent on some of the variables x_1, ..., x_n
and I, and such an inversion will be possible. For example, say y on the
left only depends on one x on the right:

    if y = x + 1 then x = y - 1; again, assuming y is a vector, then

    if y = <0>//x then x = y[y_1 ← λ]; also

    if y = <i>//x then x = y[y_1 ← λ] and i = y_1

    if y = <0>//x if i = 1
         = <1>//x if i = 2
    then x = y[y_1 ← λ] and
         i = 1 if y_1 = 0
           = 2 if y_1 = 1


2. If there is no simple equation which can thus be inverted, then the
conclusion is that a 'simple' uniform inverse is not available. This does
not mean there is no uniform inverse, but since the computation of the
inverse, when it exists, will become part of the implementing flowchart,
there is some justification in limiting our search to uniform inverses which
are relatively simple to compute.

If one or more equations can be inverted, obtaining solutions for some
x_j's and perhaps for I, the solutions are substituted for those x_j's and
perhaps I in the other unsolved equations and the procedure returns to step 1,
applying it to the unsolved equations with their substitutions.

If finally all equations are solved, the 'simple' uniform inverse
has been obtained.
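A small worked instance may make the procedure concrete. The Python below is a sketch under assumptions: the O-function `o` is a made-up example with two components, a vector x that gets <0> or <1> prefixed according to I and a counter n that is decreased by one, and the names `o` and `uniform_inverse` are hypothetical. Each component equation can be inverted by itself, so the 'simple' uniform inverse is read off component by component.

```python
def o(i, X):
    """Assumed example: o_I(<x, n>) = <<I-1>//x, n-1> for I = 1 or 2."""
    x, n = X
    return ([i - 1] + x, n - 1)


def uniform_inverse(Y):
    """Solve set A for this example; returns (i_0(Y), o^-1(Y)).

    y_1 = <0>//x if I = 1, <1>//x if I = 2  gives  x = y_1 without its
    first component and I = 1 if that component is 0, else 2.
    y_2 = n - 1  gives  n = y_2 + 1.
    """
    y, m = Y
    i = 1 if y[0] == 0 else 2
    return i, (y[1:], m + 1)


X = ([1, 0], 3)
recovered = [uniform_inverse(o(i, X)) for i in (1, 2)]
```

In both cases the index I and the original data-structure X are recovered from the result alone, which is the property the uniform inverse requires.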

Simple  as  this  procedure  appears  there  is  still  considerable  difficulty 
in  determining  how  and  if  a  simple  equation  can  be  inverted  when  one  is 
dealing  with  relatively  exotic  functions  such  as  concatenation,  decision, 
insertion,  etc.  which  arise  in  actual  recursive  definitions.  At  this 
stage  even  this  simple  step  is  handled  heuristically  for  a  limited  number 
of  primitive  functions  with  no  guarantee  that  inverses  will  always  be 
produced  when  they  exist. 


References:

1. Aho, Hopcroft, Ullman; The Design and Analysis of Computer Algorithms;
   Addison-Wesley, 1975; pp 195,222

2. Darlington, J. and Burstall, R.M.; A System Which Automatically
   Improves Programs; Proceedings Third International Joint Conference
   on Artificial Intelligence; Stanford, California, 1973; pp 479-485

3. Darlington, J. and Burstall, R.M.; A Transformation System for
   Developing Recursive Programs; JACM; January, 1976

4. Nilsson, N.; Problem Solving Methods in Artificial Intelligence;
   McGraw-Hill; 1971

5. Strong, H.R.; Translating Recursive Equations Into Flow Charts; Journal
   of Computer System Sciences; 1971; pp 254-285

6. Strong, H.R. and Walker, S.A.; Characterization of Flowchartable
   Recursions; Journal of Computer System Sciences; Vol. 7, #4;
   August, 1973; pp 407,447

7. Pauli, M.C.; Formulation and Manipulation of Enumeration Based Algorithms;
   Research Report SOSAP-TR-4; December, 1973

8. Pauli, M.C.; Properties Which Allow Optimizing the Implementation of
   Recursive Definitions and Notes on Searching for Some Such Properties;
   Research Report SOSAP-TM-5; September, 1974



Appendix I

Summary of Frequently Used Notation

If P is a predicate then P̄ means not P.

N is the set of all positive integers = {1,2,...}

N_n is the finite set of integers from 1 to n = {1,...,n}

If A and B are sets
    A ∪ B is set union
    A ∩ B is set intersection
    Ā is the complement of A
    A - B = A ∩ B̄
    |A| = the number of elements in A

<a_1, ..., a_n> is an ordered set or vector with components
a_i, i ∈ N_n; a_(i:j) represents the subvector <a_i, a_(i+1), ..., a_j>

If A and B are ordered sets = <a_1, ..., a_n> and <b_1, ..., b_n> respectively
    A // B = <a_1, ..., a_n, b_1, ..., b_n>

(A) is the set of all components in A

If E, x and y are each an expression, i.e. a string or ordered set of
symbols from a given alphabet, usually satisfying some constraints as to
form, then

    E[x ← y] is the expression that results when each occurrence of x is
    replaced by y in E.

The notation is extended to allow the specification of a number of
replacements: E[x ← y, z ← w] is the expression which results when
each x is replaced by y and each z by w in E.


SOSAP-TR-39 


June  1977 


A  PRINCIPLE  USEFUL  IN  THE  DESIGN  OF  MINIMUM  PATH  AND  OTHER  ALGORITHMS 
M.  C.  Pauli 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6  to  the 
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


Abstract 


Central to the development of synthesis procedures for "good"
algorithms is the identification of principles with sufficient
generality  to  provide  the  design  basis  for  a  class  of  algorithms. 

We  need  the  simplest  principles  to  cover  the  largest  possible  classes. 
This  requires  the  appropriate  formulation  of  the  class  and  the  state¬ 
ment  of  the  principle  in  terms  of  that  class.  A  principle  which  applies 
to  such  a  class  is  developed  here.  The  class  consists  of  algorithms 
which  solve  sets  of  equations.  It  includes,  for  example,  an  algorithm 
for  the  minimum  path  graph  problem.  The  distinguishing  properties 
of  this  class  are  identified.  The  principle  which  underlies  the 
timewise  efficient  algorithms  of  this  class  is  given  and  justified. 

The  algorithm  thus  developed  for  the  minimum  path  example  is  compared 
with  well-known  alternatives.  Finally,  the  principle  is  applied 
to  another  example:  finding  the  minimum  cost  association  of  matrix 
multiplication,  and  shown  to  provide  an  algorithm  having  advantages 
over  others  previously  described. 


Key Words and Phrases: algorithm design, minimum path, complexity,
equation sets.

CR  Categories:  5.24,  5.25,  5.32,  5.42. 


A  PRINCIPLE  USEFUL  IN  THE  DESIGN  OF  MINIMUM  PATH  AND  OTHER  ALGORITHMS 


In  this  paper  we  consider  a  simple  principle,  called  the  Minimum 
Constant  Principle,  which  accounts  for  the  existence  of  efficient 
solutions  in  a  certain  class  of  equation  sets.  Problems  such  as 
finding  the  minimum  path  in  a  graph,  and  the  minimum  cost  order  of 
association  in  a  matrix  multiplication  can  each  be  formulated  as  a 
member  of  this  class. 


The principle has an interesting relation to the minimum
coefficient principle, which is the name we give to the principle used
in Dykstra's minimum path algorithm. When applied to the minimum path
problem, in fact, both yield virtually identical results. Their
usefulness in application to the minimal cost association of matrix
multiplication problem, however, is quite different.


Consider the variables X_1, ..., X_n which are to take values in a
set D of positive numbers and which are related by the following set
of equations:

    1)  {X_i = min(T_i1, ..., T_ip, B_i) | 1 ≤ i ≤ n}

in which:

    T_ij is a function whose arguments are some subset S
    of the variables X_1 through X_n and whose value is
    always greater than that of any of those variables.

    min(A,B) = the smaller of A and B

Such an equation set is called minimum-monotonic.
The Minimum Constant Principle is easily derived
for such a set of equations.

Take the minimum of both sides of 1):

    min(X_1, ..., X_n) = min(T_11, ..., T_1p, B_1,
                             T_21, ..., T_2p, B_2,
                             ...
                             T_n1, ..., T_np, B_n)

Since min(X_1, ..., X_n) = some X_i (1 ≤ i ≤ n), namely the
smallest, and since each T_ij > than at least one X_i (1 ≤ i ≤ n):

    min(X_1, ..., X_n) < T_ij for any 1 ≤ i,j ≤ n

thus:

    min(T_11, ..., T_1p, B_1,
        T_21, ..., T_2p, B_2,
        ...
        T_n1, ..., T_np, B_n) < T_ij for any 1 ≤ i,j ≤ n

Therefore:

    min(X_1, ..., X_n) = min(B_1, ..., B_n)

and if:

    min(B_1, ..., B_n) = B_a

then, since X_a ≤ B_a by 1) and X_a ≥ min(X_1, ..., X_n) = B_a:

    X_a = B_a

Therefore for a minimum-monotonic equation set, we have the
minimum constant principle:

Theorem 1: if min(B_1, ..., B_n) = B_a then X_a = B_a

Algorithm  Incorporating  The  Minimum  Constant  Principle 

The  principle  established  by  this  theorem  can  be  made  the  basis 
of  an  algorithm  for  solving  a  set  of  minimum  monotonic  equations. 

The  algorithm  is  an  adaptation  of  Gaussian  Elimination.  At  each 
step  a  new  set  of  equations  is  formed  having  one  fewer  equation  than 
at  the  previous  step  and  involving  one  less  variable.  The  choice  of 
the order of elimination is determined by the Minimum Constant
Principle. The step will be indicated by the count k and the
variables and constants as they appear in the equations formed in that
step will be indexed by k. Thus B_i^(k) is the designation of the
constant term in the equation for X_i formed in step k. U(k) is the
set of variables which have been used - or eliminated in steps 1
through k, A(k) are those still active.

With k initially = 1, U(0) = the empty set and A(0) = N = {1,2,
..., n}, the algorithm to solve for X_d is given as follows:

Minimum Constant Algorithm:

Loop: Find min{B_i^(k-1) | i ∈ A(k-1)}.

      Say it = B_I(k)^(k-1), i.e. the index of the minimum of
      the constant terms is I(k).

      Record that X_I(k) equals B_I(k)^(k-1).

      If I(k) = d, then X_I(k) = X_d and the algorithm
      terminates; otherwise

      Substitute B_I(k)^(k-1) for all occurrences of X_I(k) in each
      T_ij^(k-1) with i ∈ A(k) (which is = A(k-1) - {I(k)}) and 1 ≤ j ≤ p.

      Replace the constant term B_i^(k-1) in each equation with
      the minimum of that term with any new all constant
      terms formed as a result of the substitution.
      The new constant is designated B_i^(k).

      The resulting equations, excluding that for X_I(k),
      form the k-th set.

      Return to Loop.

Footnote: Theorem 1 will still hold if T_ij is only required to be greater
than at least one variable of which it is a function. However,
unlike the property given, this property does not necessarily persist
when a constant replaces a variable in T_ij. This persistence is
necessary for consistent use of the principle in the Minimum Constant
Algorithm.


If the equations are to be solved for all, rather than for just
one variable, the Loop can be continued n times, with a new variable
value determined each time through the Loop. The check for I(k) = d
can be eliminated.

It is easy to show, primarily because we are able always to
substitute a constant rather than an entire expression involving
variables in eliminating variables, that this is an O(n^2) algorithm.
This is so in solving in this way for any number of variables.
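The loop can be sketched for a restricted case. The Python below is an illustration under assumptions, not the general algorithm: every term is taken to have the form c_ij + X_j with c_ij > 0 (so it exceeds its variable, as minimum-monotonic requires), and the names `minimum_constant_solve`, `terms` and `consts` are hypothetical. Because only a constant is ever substituted, each step does at most one update per active equation.

```python
import math


def minimum_constant_solve(terms, consts, target=None):
    """Solve X_i = min(c_ij + X_j for j in terms[i], B_i).

    terms[i] is a dict {j: c_ij} with every c_ij > 0 (assumed form);
    consts[i] is the constant B_i. Stops early once target is solved.
    """
    B = dict(consts)                     # current constant terms B_i^(k-1)
    active = set(B)                      # A(k-1)
    solved = {}
    while active:
        i_k = min(active, key=lambda i: (B[i], i))   # index of minimum constant
        solved[i_k] = B[i_k]             # Theorem 1: X_I(k) = B_I(k)
        active.remove(i_k)
        if i_k == target:
            break
        for i in active:                 # substitute the constant for X_I(k)
            c = terms.get(i, {}).get(i_k)
            if c is not None:
                B[i] = min(B[i], c + solved[i_k])
    return solved


# A made-up four-equation example with one finite constant.
example_terms = {1: {}, 2: {1: 1}, 3: {1: 4, 2: 2}, 4: {3: 1, 2: 5}}
example_consts = {1: 0, 2: math.inf, 3: math.inf, 4: math.inf}
solution = minimum_constant_solve(example_terms, example_consts)
```

Passing `target=d` corresponds to the check for I(k) = d; omitting it corresponds to continuing the Loop n times.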


Application of Minimum Constant Principle to Minimum Path Problems
and Comparison With Alternative Algorithms


The problem of finding the minimum cost path between node 1 and
node n in a directed graph, G, with n nodes = N = {1,2, ..., n}, and
positive non-zero branch costs, can be formulated as a problem of
solving a set of equations. Let X_i be interpreted as the minimum cost
to reach node n from node i (thus X_n = 0). Let a_ij be the cost of the
branch from node i to node j, which is a given of the problem. If
there is no such branch then a_ij = ∞. If a branch exists from i to
j, then j is an 'outward neighbor' of i, and i an 'inward neighbor' of
j. The following equation set then represents a legitimate set of
relations amongst the nodes of G.

    PATH1:  {X_i = min(a_i1 + X_1, ..., a_in + X_n, ∞) | i ∈ N - {n}}

            X_n = 0

This  expresses  the  fact  that  the  cost  of  the  minimum  path  from 
node  i  to  node  n  is  determined  by  finding  the  minimum  cost  from  each 
outward  neighboring  node  of  i  adding  the  cost  to  reach  that  neighbor, 
and  then  minimizing  this  sum  over  all  neighbors. 

There is also a dual formulation of the problem.

Let Y_i be interpreted as the minimum cost to reach node i from
node 1 (thus Y_1 = 0). Let a_ij be as defined above.

    PATH1':  Y_1 = 0

             {Y_i = min(a_1i + Y_1, ..., a_ni + Y_n, ∞) | i ∈ N - {1}}

This  expresses  the  fact  that  the  cost  of  the  minimum  path  to  a 
node  i  from  node  1  is  determined  by  finding  for  each  inward  neighbor 
of  node  i  the  minimum  cost  to  reach  it  from  node  1,  adding  the  cost  of 
the  branch  from  that  neighbor  to  i,  and  then  minimizing  this  sum  over 
all  such  neighbors. 


For a given graph problem the solution for X_1 in the PATH1 set of
equations should equal the solution for Y_n in the PATH1' formulation.

Either set of equations may be solved by the Minimum Constant
Algorithm and is thus of O(n^2) complexity. (The application of this
algorithm to PATH1' will be described in considerable detail shortly.)

A widely favored alternative for solving such minimum path
problems is Dykstra's Algorithm. When interpreted as a procedure on a
set of equations, a major source of the algorithm's strength is seen to
come from the application of the 'Minimum Coefficient Principle'.

An  interesting  relation  exists  between  the  algorithms  embedding 
the  Minimum  Coefficient  and  Minimum  Constant  Principles  when 
respectively  applied  to  a  minimum  path  problem  and  its  dual. 

To  establish  this  relation  we  first  will  translate  Dykstra’s 
Algorithm  as  it  is  applied  to  the  minimum  path  problem  formulated  as  a 
set  of  equations.  In  particular,  consider  the  PATH1  equations. 


In Dykstra's Algorithm one keeps a list of the current minimum
cost to reach each node j from the starting node 1. In equation terms
one works with an equation for X_1 which is stepwise developed by
substitution for variables on its right. The cost to reach each node j
from 1 is designated a_1j^(k) after the kth iteration of the algorithm.
Initially, k = 0 and a_ij^(0) is the cost associated with the branch
<i,j>. [Thus a_ij^(0) = a_ij in the set of equations formulation of the
problem.] Also there is the set A(k) of all nodes not yet solved.

Initially A(0) = {2, ..., n}. On the kth iteration one finds the
minimum of {a_1j^(k-1) | j ∈ A(k-1)} and determines the index of that term.
Call this index I(k). The set of active nodes or indices, A(k), can
now be updated. A(k) = A(k-1) - {I(k)}. Next if I(k) ≠ the
destination node, n, then for each node j which is an outward neighbor
of I(k) and in A(k), the cost of branch <I(k),j> is added to the cost
of reaching I(k) from 1, a_1I(k)^(k-1). Note that a_1I(k)^(k-1) as
indicated is the coefficient of X_I(k) on the right of the equation for
X_1. For each j this sum is compared with the previous minimum
cost to reach node j, a_1j^(k-1), and the minimum of these two becomes the
current minimum:

    a_1j^(k) = min(a_1j^(k-1), a_1I(k)^(k-1) + a_I(k)j)

This is equivalent to duplicating the X_I(k) term on the right of the
current X_1 equation and then in that duplicated term substituting for
the variable X_I(k) the right of the X_I(k) equation and gathering
terms.

Now the k+1st iteration is undertaken.

If I(k) = the destination node, n, the algorithm terminates with
the minimum cost of reaching n from the start node 1 (= X_1 in the
equation) being given by a_1n^(k) = a_1n^(k-1).

Summary:

    a_ij^(0) = a_ij = cost of branch from node i to node j

    I(k) = minindex({a_1j^(k-1) | j ∈ A(k-1)})

        where minindex({A_1, ..., A_n}) = smallest j
        such that A_j = minimum(A_1, ..., A_n)

    a_1j^(k) = min(a_1j^(k-1), a_1I(k)^(k-1) + a_I(k)j)

    if I(k) = n, answer = a_1n^(k-1)
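The summary can be read almost directly as a program. The Python below is a sketch under assumptions: the name `dykstra_min_cost` is hypothetical, branch costs are supplied as a dictionary keyed by (i, j) with missing branches treated as infinite, and the graph is assumed to have at least two nodes.

```python
import math


def dykstra_min_cost(a, n):
    """Minimum cost from node 1 to node n, following the summary above.

    a maps (i, j) to the branch cost a_ij; absent branches cost infinity.
    Assumes n >= 2 and positive branch costs.
    """
    a1 = {j: a.get((1, j), math.inf) for j in range(2, n + 1)}   # a_1j^(0)
    active = set(range(2, n + 1))                                # A(0)
    while active:
        i_k = min(active, key=lambda j: (a1[j], j))   # I(k) = minindex
        active.remove(i_k)                            # A(k) = A(k-1) - {I(k)}
        if i_k == n:
            return a1[n]                              # answer = a_1n^(k-1)
        for j in active:                              # update current minimum
            a1[j] = min(a1[j], a1[i_k] + a.get((i_k, j), math.inf))
    return math.inf


# A made-up four-node graph.
branch_costs = {(1, 2): 1, (1, 3): 4, (2, 3): 2, (3, 4): 1, (2, 4): 5}
cost = dykstra_min_cost(branch_costs, 4)
```

On this graph the iterations select nodes 2, 3 and 4 in turn, and the path 1, 2, 3, 4 of cost 4 is found.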


In this algorithm, the fact that one can always substitute for
the variable on the right of the current X_1 equation which has the
smallest coefficient and thus arrive at the solution when X_n on the
right has the smallest coefficient is referred to as the Minimum
Coefficient Principle. This principle seems to require that the
equations be minimum monotonic as defined on page 1 and further that
the terms in that definition be restricted to a single variable.

For comparison consider the application of the Minimum Constant
Algorithm to the minimum path problem. We will apply this algorithm
to the dual of the problem to which we applied Dykstra's Algorithm -
namely to its formulation in PATH1'. For comparison we will try to
develop this application of the algorithm in the same notation used
above for Dykstra's Algorithm.

In order to show the relation of this algorithm to the previous
one the initial step has to be separated from the remainder of the
algorithm. B_i is the constant term in the i-th equation.

Initially B_i = 0 for i = 1; B_i = ∞ for 2 ≤ i ≤ n.

So minindex({B_i | i ∈ N}) = 1.

    Y_1 = 0.

The result of substituting 0 for Y_1 in the terms containing Y_1 in
the equation for Y_i, 2 ≤ i ≤ n, is to make that term equal to
0 + a_1i = a_1i. This constant is minimized with the existing constant
term, which is ∞ in each of these equations, giving a new constant
term designated B_i^(0) = a_1i^(0) in the i-th equation.

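After this step, which is not counted, the set of equations to be
solved for $Y_n$ then is:

    $Y_2 = \min(a_{22} + Y_2,\ a_{32} + Y_3,\ \ldots,\ a_{n2} + Y_n,\ a_{12})$

    $Y_3 = \min(a_{23} + Y_2,\ \ldots,\ a_{n3} + Y_n,\ a_{13})$

      ...

    $Y_n = \min(a_{2n} + Y_2,\ \ldots,\ a_{nn} + Y_n,\ a_{1n})$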


Let $A(0) = \{2, \ldots, n\}$, and let $B_i^{(0)}$, also called $a_{1i}^{(0)}$ in this
case, $= a_{1i}$ (as in the application of the Minimum Coefficient Principle).

Start with k = 1.


On the k-th iteration the Minimum Constant Algorithm calls for
determining the minimum index of all the constant terms, i.e. of
$B_i^{(k-1)} = a_{1i}^{(k-1)}$, $i \in A(k-1)$. Let that index = $I(k)$. Then, because of
theorem 1:

    $Y_{I(k)} = B_{I(k)}^{(k-1)} = a_{1I(k)}^{(k-1)}$

    $A(k) = A(k-1) - \{I(k)\}$

If $I(k) \ne n$, the destination node, then at the next part of the k-th
iteration $a_{1I(k)}^{(k-1)}$ is substituted for each occurrence of $Y_{I(k)}$ in all
equations for $Y_j$ with $j \in A(k)$. The term involving $Y_{I(k)}$ in the j-th
equation then becomes $a_{1I(k)}^{(k-1)} + a_{I(k)j}$. The minimum of this and the
current constant term $a_{1j}^{(k-1)}$ in the j-th equation becomes the new
constant term, i.e.:

    $B_j^{(k)} = a_{1j}^{(k)} = \min\left(a_{1j}^{(k-1)},\ a_{1I(k)}^{(k-1)} + a_{I(k)j}\right)$

Now the (k+1)-st iteration may be started.


If $I(k) = n$, the destination node, then the algorithm terminates
with $Y_n = B_n^{(k-1)} = a_{1n}^{(k-1)}$, the answer.

After the initial step, this algorithm may be summarized by the
same set of relations as given in the Summary for Dijkstra's
Algorithm. Therefore, in application to the problem of finding the
cost of the minimum path in a positive weighted di-graph, we have
shown that the Minimum Constant Algorithm applied to one formulation
of the problem follows a virtually identical course to that of
Dijkstra's Algorithm applied to the dual of that formulation.


These  two  minimum  path  algorithms  may  therefore  be  considered 
really  the  same  though  the  principles  on  which  they  are  based  differ 
considerably. 


So far we have considered only the cost of the minimum path and
not the computation of the path itself. That computation for
Dijkstra's algorithm is well known. For completeness we will briefly
sketch the analogous computation for the Minimum Constant Algorithm
based on its application to PATH1'.


The path can be computed by a kind of 'Hansel and Gretel'
principle. Whenever the constant term of an equation, say the j-th, is
changed during the k-th iteration, this occurs because
$a_{1I(k)}^{(k-1)} + a_{I(k)j} < a_{1j}^{(k-1)}$. When that happens the value of $Y_j$ is, as far as
is known up to this iteration, directly dependent on $Y_{I(k)}$ as it occurs
on the right side of the $Y_j$ equation. A vector NXT with n entries may
be kept with NXT[j] set to I(k) to record each occurrence of the event
described in the previous sentence. When the computation of the cost
of the minimum path, $Y_n$, is completed the minimum path will be
computable from this trail through the right side of the equations in
the NXT vector. That path is <1, ..., NXT(k), k, ...,
NXT(NXT(n)), NXT(n), n>.
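The NXT bookkeeping can be sketched as follows. The sketch is hypothetical in its details: the name `min_constant_path`, the 0-based node numbering (node 0 is the start node), and in particular the initialization of every NXT entry to the start node are assumptions not stated in the text; they reflect that each initial constant term comes directly from the branch out of the start node.

```python
import math

def min_constant_path(a, dest):
    """Return (cost, path) of a cheapest path from node 0 to `dest` (dest != 0).

    a[i][j] is the cost of the branch i -> j, or math.inf if absent.
    """
    n = len(a)
    B = list(a[0])                    # constant terms after the uncounted initial step
    nxt = [0] * n                     # assumed initial trail: straight from the start node
    unsolved = set(range(1, n))
    while unsolved:
        i_k = min(unsolved, key=lambda i: (B[i], i))   # minimum constant
        unsolved.discard(i_k)
        if i_k == dest:
            break
        for j in unsolved:
            candidate = B[i_k] + a[i_k][j]
            if candidate < B[j]:      # the constant term of equation j changes
                B[j] = candidate
                nxt[j] = i_k          # drop a crumb: Y_j now depends on Y_I(k)
    if B[dest] == math.inf:
        return math.inf, []           # destination not reachable
    path = [dest]
    while path[-1] != 0:              # follow the trail back to the start node
        path.append(nxt[path[-1]])
    path.reverse()
    return B[dest], path
```

The trail is read backwards from the destination, which mirrors the form <1, ..., NXT(NXT(n)), NXT(n), n> given above.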


Another Application of the Minimum Constant Principle

The Minimum Constant Principle will now be applied to a problem
discussed in reference 1 as an illustration of the 'dynamic
programming' approach to algorithm design. The problem is to develop
an algorithm for deciding which of the equivalent ways of associating
the multiplication of a set of n matrices will result in the fewest
multiplications of matrix components. For example, if $M_1$ and $M_3$ are
both 3 by 7 matrices and $M_2$ a 7 by 3 matrix then the multiplication
$M_1 \times M_2 \times M_3$ can be performed in the order given by the following
association $((M_1 \times M_2) \times M_3)$ with a cost of 3 x 7 x 3 = 63
component multiplications for $(M_1 \times M_2)$ and then 3 x 3 x 7 = 63
component multiplications for multiplying the 3 by 3 result of
$(M_1 \times M_2)$ with the 3 by 7 matrix $M_3$. Thus this association costs 126.
On the other hand $(M_1 \times (M_2 \times M_3))$ costs 294 multiplications:
7 x 3 x 7 = 147 for $(M_2 \times M_3)$ and then 3 x 7 x 7 = 147 for multiplying
$M_1$ with the 7 by 7 result. These calculations use the fact that the
cost = the number of component multiplications required to multiply
the $r_a$ by $c_a$ matrix, $M_a$, and the $r_b$ by $c_b$ matrix, $M_b$,
$= r_a \times (c_a = r_b) \times c_b$.

In general, we are confronted with the multiplication of n
matrices $M_1 \times M_2 \times \ldots \times M_n$, with $M_i$ having the dimensions $r_i$ by $c_i$.
Let $X_{ij}$ ($j \ge i$) be interpreted as the minimum cost, over all possible
associations, to multiply $M_i \times M_{i+1} \times \ldots \times M_j$; then the relations
between the $X_{ij}$ for a given problem involving n matrices are:

MATRIX1:

    $X_{ij} = 0$                                                      if j = i

    $X_{ij} = \min_{k=0 \text{ to } j-i-1} (a_{ijk} + X_{i,i+k} + X_{i+k+1,j})$     if j > i

where $a_{ijk}$ is the cost of multiplying the result of $M_i \times \ldots \times M_{i+k}$
by the result of $M_{i+k+1} \times \ldots \times M_j$ $= r_i \times (c_{i+k} = r_{i+k+1}) \times c_j$.

This relation represents the fact that the multiplication
$M_i \times M_{i+1} \times M_{i+2} \times \ldots \times M_{j-1} \times M_j$ can be finally done with one of the
following associations:

    $M_i \times (M_{i+1} \times M_{i+2} \times \ldots \times M_{j-1} \times M_j)$

    $(M_i \times M_{i+1}) \times (M_{i+2} \times \ldots \times M_{j-1} \times M_j)$

      ...

    $(M_i \times M_{i+1} \times M_{i+2} \times \ldots \times M_{j-1}) \times M_j$

The 'dynamic programming' solution of reference 1 to this problem,
formulated as a procedure for solving the MATRIX equations, is given as
follows.


The $X_{ij}$'s are arranged according to the difference $j - i = \Delta$. For
$\Delta = 0$, $X_{ij} = 0$. For $\Delta = 1$, all $X_{ij}$'s only depend on $X_{ij}$'s for which
$\Delta = 0$, and may be solved immediately. In general all $X_{ij}$ with $\Delta = d$
depend only on $X_{ij}$'s with $\Delta < d$, and thus may be solved when these are
known. Then the computation can be summarized:

    On the d-th iteration for every $X_{ij}$ with $n \ge j > i \ge 1$ and
    $j - i = d$, compute $X_{ij} = \min_k (a_{ijk} + X_{i,i+k} + X_{i+k+1,j})$. The last
    iteration is for d = n-1. $X_{1n}$ is the answer.


This algorithm involves computing the value of n-d variables $X_{ij}$
with j-i = d. The computation of each variable $X_{ij}$ with j-i = d
requires taking a minimum of d terms. The computation time for each
term is independent of the number of matrices being multiplied. It
depends only on the size of the matrices. Let it = K. Thus the
maximum time is given by:

    $K \cdot \sum_{d=1}^{n-1} (n-d) \cdot d$   which is $O(n^3)$

Note that when the computation is done this way the average time
= the maximum time.
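The iteration over increasing d can be sketched in Python as below. The function name `dp_chain_cost` and the 0-based lists `R` and `C` (rows and columns of each matrix) are illustrative choices made here, not notation from reference 1.

```python
def dp_chain_cost(R, C):
    """Minimum number of component multiplications for M_1 x ... x M_n.

    R[i], C[i] are the row and column counts of the i-th matrix (0-based).
    """
    n = len(R)
    X = [[0] * n for _ in range(n)]          # X[i][i] = 0
    for d in range(1, n):                    # d = j - i
        for i in range(n - d):
            j = i + d
            # a_ijk = R[i] * C[i+k] * C[j]; k runs over the d possible splits
            X[i][j] = min(
                R[i] * C[i + k] * C[j] + X[i][i + k] + X[i + k + 1][j]
                for k in range(d)
            )
    return X[0][n - 1]
```

For the three matrices of the earlier example (3 by 7, 7 by 3, 3 by 7), every $X_{ij}$ is computed, including the one for $M_2 \times M_3$, before the final minimum is taken.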


The minimum cost can also be obtained by the application of the
Minimum Constant Principle to the set of equations - MATRIX. An
example will illustrate this approach and highlight its differences
from the previous approach. Consider the example of multiplying
matrices:

    $M_1 \times M_2 \times M_3$

with $M_1$ and $M_3$ being 3 by 7 matrices and $M_2$ being a 7 by 3 matrix.
The equations for this case are:


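    $X_{13} = \min(a_{130} + X_{11} + X_{23},\ a_{131} + X_{12} + X_{33},\ \infty)$

    $X_{12} = \min(a_{120} + X_{11} + X_{22},\ \infty)$

    $X_{23} = \min(a_{230} + X_{22} + X_{33},\ \infty)$

    $X_{11} = 0$

    $X_{22} = 0$

    $X_{33} = 0$

where $M_i$ is a $r_i$ by $c_i$ matrix and $a_{ijk} = r_i \times c_{i+k} \times c_j$.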


The minimum constant is 0. $X_{11}$, $X_{22}$, and $X_{33}$ have this value.
Substituting for $X_{11}$ first leaves the minimum constant = 0, with $X_{22}$
and $X_{33}$ still having this value. $X_{22}$ and $X_{33}$ are then similarly
substituted for, after which the remaining equations will be:

    $X_{13} = \min(a_{130} + X_{23},\ a_{131} + X_{12},\ \infty)$

    $X_{12} = \min(a_{120}) = 63$

    $X_{23} = \min(a_{230}) = 147$

The minimum constant now is 63 with $X_{12}$ already having that
value. Substituting for $X_{12}$ next then leaves:

    $X_{13} = \min(a_{130} + X_{23},\ a_{131} + 63)$

           $= \min(a_{130} + X_{23},\ 63 + 63)$

           $= \min(a_{130} + X_{23},\ 126)$

    $X_{23} = 147$


The minimum constant now is 126. It is the constant term in the
$X_{13}$ equation so, by the minimum constant principle, $X_{13} = 126$ and this
is the answer.

This answer was obtained without ever substituting for $X_{23}$. In
the 'dynamic programming' algorithm this would have had to have been
done. On the other hand, in this algorithm it is necessary to find the
minimum of all the constants, once for each of the substitutions, and
to keep a record of some partially solved equations.

A matrix multiplication algorithm incorporating the Minimum
Constant Principle simulating the above process will now be given.
There is a matrix M. An entry M(i,j,k) indicates the
current state of $X_{i,i+k}$ and $X_{i+k+1,j}$, the two variables which appear
in the k-th term on the right of the $X_{ij}$ equation. The significance
of the entries in M(i,j,k) is given as follows:

    M(i,j,k) = 0 if neither $X_{i,i+k}$ nor $X_{i+k+1,j}$ is known

             = 1 if either $X_{i,i+k}$ or $X_{i+k+1,j}$ is known


Whenever a value for some variable $X_{ab}$ becomes known, all $X_{ij}$
equations which have $X_{ab}$ on the right are affected. That effect is
recorded by updating the matrix M as follows:

    M(a,j,b-a)   := 1 for $n \ge j > b$
    M(i,b,a-i-1) := 1 for $a > i \ge 1$
    if those entries were 0.


If an entry M(i,j,k) already = 1 is to be updated, then
$X_{i,i+k} + X_{i+k+1,j} + a_{ijk}$ is computed using values stored in a second matrix
CON. This is easily done since CON[i,j] holds the current constant
term of the equation for $X_{ij}$, and under the conditions stated, $X_{i,i+k}$
and $X_{i+k+1,j}$ must both be in CON. Furthermore, $a_{ijk}$ depends only on
the given dimensions of the matrices involved. Next this quantity
= $a_{ijk}$ + CON(i,i+k) + CON(i+k+1,j) is compared with the current value
of CON(i,j). The minimum becomes the new value of CON(i,j), i.e.:

    CON(i,j) := min($a_{ijk}$ + CON(i,i+k) + CON(i+k+1,j), CON(i,j)).


In order to identify the constants over which the minimum is
still to be taken, all those entries in CON which are neither $\infty$ (their
initial value), nor have already served as a minimum constant are
linked together.

The number of rows of the i-th matrix to be multiplied is in R[i]
and the number of its columns in C[i].

Minimum Constant Algorithm for Minimum Cost of Associating Matrix
Multiplications:

    Initialize:
        M = 0
        CON = $\infty$, except CON[I,I] = 0 for 1 <= I <= N (these entries are linked)

    while A,B /= 1,N
        Find the minimum of all linked entries in
        constant matrix CON.
        Put these indices in A and B.
        Remove CON[A,B] from the linked list.
        for J = B+1 to N
            if M[A,J,B-A] = 1 then UPDATE[A,J,B-A]
            else M[A,J,B-A] := 1
        end
        for I = 1 to A-1
            if M[I,B,A-I-1] = 1 then UPDATE[I,B,A-I-1]
            else M[I,B,A-I-1] := 1
        end
    end

    procedure UPDATE[I,J,K]
        CON[I,J] := min[R[I] x C[I+K] x C[J] + CON[I,I+K]
                        + CON[I+K+1,J], CON[I,J]]
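A runnable sketch of this procedure is given below, with several assumptions that go beyond the text: the function name `min_constant_chain_cost`, a Python set `linked` standing in for the linked list, a set `M` holding the (i, j, k) entries that equal 1, a `solved` set that keeps an already-used constant from being linked again, and the initialization CON[i][i] = 0 inferred from $X_{ii} = 0$. Indices are 1-based inside the function to match the pseudocode.

```python
import math

def min_constant_chain_cost(R, C):
    """Minimum cost of associating M_1 x ... x M_n (n >= 1).

    R and C are 0-based lists of row and column counts.
    """
    n = len(R)
    r = [None] + list(R)              # shift to 1-based indexing
    c = [None] + list(C)
    CON = [[math.inf] * (n + 1) for _ in range(n + 1)]
    M = set()                         # (i, j, k) present <=> M[i,j,k] = 1
    linked = set()                    # candidates for the next minimum constant
    solved = set()                    # entries that already served as a minimum
    for i in range(1, n + 1):
        CON[i][i] = 0
        linked.add((i, i))

    def update(i, j, k):
        # both variables of the k-th term are now known constants
        cand = r[i] * c[i + k] * c[j] + CON[i][i + k] + CON[i + k + 1][j]
        CON[i][j] = min(CON[i][j], cand)
        if (i, j) not in solved:
            linked.add((i, j))

    def note_known(i, j, k):
        if (i, j, k) in M:
            update(i, j, k)
        else:
            M.add((i, j, k))

    while True:
        a, b = min(linked, key=lambda ij: (CON[ij[0]][ij[1]], ij))
        linked.remove((a, b))
        solved.add((a, b))
        if (a, b) == (1, n):
            return CON[1][n]
        for j in range(b + 1, n + 1):     # X_ab is the left variable of a term
            note_known(a, j, b - a)
        for i in range(1, a):             # X_ab is the right variable of a term
            note_known(i, b, a - i - 1)
```

On the 3 by 7, 7 by 3, 3 by 7 example the function returns as soon as the entry for $X_{13}$ is the minimum linked constant, so the entry for $X_{23}$ is never selected for substitution.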


We have not considered the actual order of association which
would achieve the minimum cost computed by the previous algorithm. As
in the minimum path example, the actual association could be inferred
by using the Hansel and Gretel Principle. This would involve
incorporating into the algorithm the facility to record which term on
the right of each equation was used in finally solving for
$X_{ij}$. This information would be sufficient to reconstruct the optimal
association of the matrices.

It is easy to see that this algorithm, like the previous one, is
$O(n^3)$ for the maximum time despite requiring a minimum operation not
required there. Its advantage is in being able to produce a solution
in considerably less than that maximum time, as illustrated by the
example.

Though the above algorithm is of interest on its own, one of our
objectives is to compare it with one which embeds the minimum
coefficient principle. There is difficulty in doing so however. Each
term on the right of an equation can involve up to two variables.
Therefore after having identified the minimum coefficient it may not
be clear which of two variables is to be substituted for. Also -
after substitution and rearrangement of the right side of an equation
into a minimum of a set of terms, each being a sum of variables and a
constant - the same variable may appear in more than one term.
Furthermore, to show the relation of the Minimum Constant and Minimum
Coefficient Principles in the case of the minimum path problem it was
necessary to consider dual formulations of that problem. Here it is
not clear what a dual formulation would be.


Though it may still be possible to generalize the Minimum
Coefficient Principle so that it is applicable to this problem, it
appears that any equivalence with the algorithm above thus established
must be of a substantially different character than that established
for the minimum path problem.


References:

1. Aho, Hopcroft, Ullman; The Design and Analysis of Computer Algorithms;
   Addison-Wesley, 1975; pp. 195, 222


SOSAP-TR-40


July  1977 


THE MIN-MAX BRANCH IN A GRAPH -- AN APPLICATION OF THE MINIMUM CONSTANT
PRINCIPLE

M.  C.  Pauli 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This research was supported by the Advanced Research Projects Agency
of the Department of Defense under Grant #DAHC15-73-G6 to the
Rutgers Project on Secure Systems and Automatic Programming

The views and conclusions contained in this document are those of the
author and should not be interpreted as necessarily representing the
official policies, either expressed or implied, of the Advanced
Research Projects Agency or the U. S. Government.



In a previous paper [1] we described a principle which formed
the basis of a 'good' algorithm for solving certain equation sets.
This is the Minimum Constant Principle. In this paper an extension of
that principle is given and, as an example, applied to the problem of
finding the minimum of all branches which are themselves each maximum
on some path between two specified nodes in a given graph.

In general, the (extended) Minimum Constant Principle is
applicable to the solution of equation sets forming a class we call
(extended) Minimum-Monotonic. Subsequently we drop 'extended' with
the understanding that all terms refer to their modified definitions
as given here.

A Minimum-Monotonic equation set has the form:

    I.  $\{X_i = \min(T_{i1}, \ldots, T_{im_i}, B_i) \mid i \in N\}$

where $N = \{1, 2, \ldots, n\}$; $X_i$, $i \in N$ are variables; $B_i$,
$i \in N$ are constants; for each j, $T_{ij}$ is a function of a
subset of the variables, $X_i$, $i \in N$. Over the entire
range of legitimate values of its variables the value
of $T_{ij}$ is $\ge$ the value of any of its variables. (Our
earlier definition [1] was more restrictive having a $>$
in place of $\ge$ above.)

[i] refers to the i-th reference listed at the end of the paper.


An algorithm will now be described which gives a solution to such
a set of equations. A solution is a set of values for the variables
$X_i$, $1 \le i \le n$. When these values are substituted for the corresponding
variables in I and the result evaluated, equal values will appear on
the left and right of each equal sign. The solution produced by the
algorithm may not be the only one. Other sets of $X_i$ values may also
give equality in I. This solution will, however, be the maximum one.
In no other solution will any variable have a higher value than it has
in this one.

The algorithm will transform the set of equations, initially in
its 0-th version in the form I, in a number of similar steps, through
a number of versions, until for all $1 \le i \le n$ the equation with $X_i$ on the
left has only a constant on its right.


Algorithm1 (Minimum Constant Algorithm)

    $U(1) = N = \{1, 2, \ldots, n\}$

    do for k = 1 to n

        Find the minimum of all constant terms in the equations for
        $i \in U(k)$.

        Let $I(k)$ be the index of this minimum constant term.

        Let $X_{I(k)} = B_{I(k)}^{(k-1)}$ replace the equation for $X_{I(k)}$ in the k-th
        version of the equation set.

        Substitute $B_{I(k)}^{(k-1)}$ for all occurrences of $X_{I(k)}$ on the right of
        equations in the (k-1)-st version of the equation set - gather
        terms on the right leaving one constant term designated $B_i^{(k)}$ on
        the right of each equation. These new equations together with
        any (k-1)-st version equations unaffected by these changes become
        the k-th equation set, with their constant terms designated
        $B_i^{(k)}$, $i \in N$.

        Set $U(k+1)$ to $U(k) - \{I(k)\}$.


Note that in the final set of equations:

    $X_{I(i)} \le X_{I(i+1)}$ for $1 \le i \le n-1$
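Algorithm1 can be sketched in Python for equation sets given in an explicit data form. The representation is an assumption made for this sketch only: each equation i is a list of terms, each term a pair `(fn, var_indices)` where `fn` computes the term from the values of the listed variables, and `B` is the list of initial constants. The name `minimum_constant_solve` is likewise hypothetical, and variables are numbered from 0.

```python
import math

def minimum_constant_solve(terms, B):
    """Return (X, order): the values assigned and the order I(1), I(2), ...

    terms[i] is a list of (fn, var_indices) for X_i = min(T_i1, ..., B_i).
    The equation set is assumed to be minimum-monotonic.
    """
    n = len(B)
    B = list(B)                       # current constant term of each equation
    X = [None] * n                    # None marks a variable not yet evaluated
    unsolved = set(range(n))
    order = []
    pending = [list(ts) for ts in terms]   # terms not yet reduced to constants
    while unsolved:
        i_k = min(unsolved, key=lambda i: (B[i], i))   # minimum constant term
        X[i_k] = B[i_k]
        unsolved.discard(i_k)
        order.append(i_k)
        for i in unsolved:
            still_pending = []
            for fn, var_indices in pending[i]:
                if all(X[v] is not None for v in var_indices):
                    # the term now evaluates to a constant: gather it into B_i
                    B[i] = min(B[i], fn(*(X[v] for v in var_indices)))
                else:
                    still_pending.append((fn, var_indices))
            pending[i] = still_pending
    return X, order
```

The sketch rescans every pending term on each step for brevity, so it does not by itself achieve the operation counts discussed in the text; an index from each variable to the terms that mention it would be needed for that.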


This algorithm involves the following two operations:

1) Substituting a constant into a term then evaluating
   that term; then, if the evaluation gives a constant, taking
   the minimum of that term and the existing constant term.
   This must be done at most $\sum_{i=1}^{n-1} i = n(n-1)/2$ times.
   Assuming each part of this operation takes a constant time
   independent of n, the cost of the worst case is proportional
   to this number;

2) Taking the minimum of n, then of n-1 constants, etc. These
   require at most n-1, n-2, etc. operations, costing at most a
   total of n(n-1)/2 basic minimum operations.

So the algorithm is $O(n^2)$, which is good, particularly if, as claimed,
the values of $X_i$ computed by the algorithm do in fact satisfy the
given equation set. This remains to be verified. The given set of
equations may not even have a unique solution. Nevertheless, as will
be shown, assigning $X_{I(k)}$ the value $B_{I(k)}^{(k-1)}$ by the above algorithm will
always give the maximum solution. Of all the values assigned to $X_{I(k)}$
in all possible solutions, the value assigned by this algorithm will
be the largest.


Theorem: Algorithm1 gives a maximal solution to the
         equation set I.

Proof: The proof is by induction on the steps of Algorithm1.
Assume that if, for each $k \in N$, $B_{I(k)}^{(k-1)}$ is substituted in all
equations of (I) for each occurrence on the right of $X_{I(k)}$ and
evaluated, then, at least for i = 1, 2, ..., j-1, the
equations for $X_{I(1)}, X_{I(2)}, \ldots, X_{I(j-1)}$ will respectively have
$B_{I(1)}^{(0)}, \ldots, B_{I(j-1)}^{(j-2)}$ on the right, and that no other
solution could assign these variables greater values.

This is true for j-1 = 1 because the equation for $X_{I(1)}$
from I is:

    (1)  $X_{I(1)} = \min(T_{I(1)1}, \ldots, T_{I(1)m_{I(1)}}, B_{I(1)}^{(0)})$

Assume substitution is made for each variable in each of the
terms $T_{I(1)j}$ above. Since (I) is minimum monotonic each $T_{I(1)j}$ is
$\ge$ the value of every variable of which it is a function. Since
further $B_{I(1)}^{(0)}$ is the smallest constant term (by definition) and
the constant $B_j$ term gives a lower bound on $X_j$, no variable
value is less than $X_{I(1)}$'s $= B_{I(1)}^{(0)}$. Thus the right side of (1)
must evaluate to $B_{I(1)}^{(0)}$. Furthermore no larger value for $X_{I(1)}$
could ever be assigned in any legitimate solution of I since by
(1) $X_{I(1)} \le B_{I(1)}^{(0)}$.

Now consider the general case. From (I) in its initial
form take the equation for $X_{I(j)}$:

    (2)  $X_{I(j)} = \min(T_{I(j)1}, \ldots, T_{I(j)m_{I(j)}}, B_{I(j)}^{(0)})$

After substitution for $X_{I(1)}, \ldots, X_{I(j-1)}$ on the right
of, and evaluation of, this equation, (2) becomes:

    (2')  $X_{I(j)} = \min(T_{I(j)1}^{(j-1)}, \ldots, B_{I(j)}^{(j-1)})$

where the terms $T_{I(j)i}^{(j-1)}$ contain only variables selected from:

    $\{X_{I(j)}, X_{I(j+1)}, \ldots, X_{I(n)}\}$

Attention to the algorithm shows that this is the form of
the equation for $X_{I(j)}$ in the (j-1)-st version of the equation
set. It has already undergone substitution and evaluation on
its right for $X_{I(1)}, \ldots, X_{I(j-1)}$.

Further substitutions in the terms on the right of (2') for
$X_{I(j+r)}$, $r \ge 0$, can only give values for these terms $\ge$ the
value $X_{I(j)}$ by minimum monotonicity and the fact that
$X_{I(j+r)} \ge X_{I(j)}$ for all $r \ge 0$. Thus upon substitution for all
variables in (2'):

    $X_{I(j)} = B_{I(j)}^{(j-1)}$

Furthermore, under the assumption that none of $X_{I(1)} = B_{I(1)}^{(0)}, \ldots,
X_{I(j-1)} = B_{I(j-1)}^{(j-2)}$ could be any larger, and since substituting
these in (2) gives (2'), $X_{I(j)}$ can not be any larger than a
value satisfying (2'). But according to (2'), $X_{I(j)} \le B_{I(j)}^{(j-1)}$.
So $B_{I(j)}^{(j-1)}$ is the largest possible value of $X_{I(j)}$.

The Minimum-Constant Principle is the proposition that a minimum
monotonic equation set can be solved by the above algorithm, which
involves using the minimum of all constant terms for variables not
yet evaluated to make the next variable evaluation.

The algorithm embedding the Minimum Constant Principle is simple,
but deciding whether it is applicable is not always nearly so simple.
The remainder of the paper is devoted to an example of a problem to
which the algorithm is applicable. The bulk of this remaining space
is needed to show that the Minimum Constant Principle is really
applicable.

An  Example  of  an  Application  of  the  Minimum  Constant  Principle 

Beyond showing that a solution to a problem must satisfy a set of
Minimum Monotonic equations, we need to show that the solution desired
is in fact the maximum solution before the Minimum Constant Algorithm
can be claimed to be applicable. If satisfying the equation set is
necessary to the solution and the equation set can be shown to have
only one solution then clearly that solution is the maximum and
satisfaction of the set is sufficient. Thus the Minimum Constant
Algorithm is applicable. It was to such special cases that this
algorithm was earlier shown applicable [1]. Minimum path graph
problems, and minimum cost association of matrix multiplication
problems provided examples there.

Consider now the problem of finding the min-max cost of paths
between two specified nodes in a digraph G. The min-max cost from
node i to node j is the minimum cost of all maximal branches on paths
from node i to node j. The maximal branch on a path from i to j is the
one with the largest cost of all the branches on that path. This
problem has found application in medical diagnosis programs. I learned
of it through [2] and discussion with A. Walker.

A necessary condition which must be satisfied by the min-max
costs from nodes i to j in a given digraph is that: (1) those min-max
costs must satisfy a certain set of minimum monotonic equations.
However, in general, such an equation set will have many solutions.
For applicability, then, it also must be shown that: (2) the maximum
solution is the one sought. Both of these points will be
demonstrated.

Necessary Condition:

Corresponding to any digraph G with nodes 1 to n, there is a set
of n equations, E, in the variables $X_1$ through $X_n$ such that the cost
of the min-max branch from node i to node n in G, when substituted for
$X_i$ in E, will satisfy E.

Let G with nodes $N = \{1, 2, \ldots, n\}$ be such a digraph.
Consider the quantity obtained by first finding the maximum cost
branch on each path from node i to node n, and then choosing the
minimum of these. Call this quantity $X_i$, or the min-max cost from i
to n. Let $c_{ij}$ be the cost of the branch from node i to node j if
there is such a branch. If not, $c_{ij} = \infty$. $B_i = \infty$ if $i \ne n$. $B_n = 0$.
(The original graph G may always be replaced by one having a branch
between every pair of nodes; those branches not appearing in the
original being given an $\infty$ cost; all others having their original
costs.) Then the following set of relations, in which
$X \Gamma Y$ = maximum(X,Y), $X L Y$ = minimum(X,Y), must hold.

    E:  $\{X_i = (c_{i1} \Gamma X_1) L (c_{i2} \Gamma X_2) L \ldots L (c_{in} \Gamma X_n) L B_i \mid i \in N\}$
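Applied to E, the Minimum Constant Algorithm takes a particularly compact form, sketched below under the assumption (argued in the remainder of the paper) that the maximum solution of E is the one sought. The function name `min_max_costs`, the 0-based node numbering, and the use of `math.inf` for a missing branch are choices made for this sketch.

```python
import math

def min_max_costs(cost, target):
    """Return the list X where X[i] is the min-max cost from node i to `target`.

    cost[i][j] is the cost of the branch i -> j, or math.inf if absent.
    """
    n = len(cost)
    B = [math.inf] * n                # constant terms: B_i = inf for i != target
    B[target] = 0                     # B_n = 0
    solved = [False] * n
    for _ in range(n):
        # minimum constant among the variables not yet evaluated
        k = min((i for i in range(n) if not solved[i]),
                key=lambda i: (B[i], i))
        solved[k] = True              # X_k = B_k
        for i in range(n):
            if not solved[i]:
                # the term (c_ik Gamma X_k) becomes a constant; gather it into B_i
                B[i] = min(B[i], max(cost[i][k], B[k]))
    return B
```

With branches 0->1 of cost 4, 1->2 of cost 2 and 0->2 of cost 5, the two paths from node 0 to node 2 have maximal branches 5 and 4, and the sketch selects the smaller.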


A more detailed justification that this set E is necessary is now
given, in which for all nodes i, $p_i$ is a path from node i to node n.
$P_i$ is all such paths. For any two nodes i,k, $p_{ik}$ is a path starting
with branch <i,k> and going on to n. $P_{ik}$ is all such paths. $i^j$
designates the j-th outward neighbor of node i.

Proof of necessity of E:

A path from i to n must be composed of $<i, i^j>$ followed by some
$p_{i^j}$ for some neighbor $i^j$ of node i. Every such path $p_{i i^j}$ has a branch
of cost $c_{i i^j}$ on it. The min-max cost of all paths in $P_{i i^j}$, going
directly from i to $i^j$ and then on to n, must then be $\ge c_{i i^j}$. On the
other hand, $P_{i i^j}$, including as it does the suffix $P_{i^j}$, must have a
min-max cost $\ge X_{i^j}$. Thus it follows that for each j the min-max of
$P_{i i^j} = c_{i i^j} \Gamma X_{i^j}$. The min-max path cost of $P_i$ is the minimum of the
min-max branch cost on all paths $p_{i i^j}$ over all neighbors $i^j$ of i, and
is thus clearly given by the above equations, E.

Next, the fact that the desired solution to this set of equations
is the maximum one will be demonstrated. This is the part of the
demonstration which, though simple, is uncomfortably long.

Sufficient  Condition: 


If  E  is  the  set  of  equations  constructed  by  the  above 

considerations  from  G,  then  the  maximum  solution  of  E  for  X  is  the 


The first of these transformations, $T_1$, involves removing one of
the variables appearing on the left of an equation in E from all its
right-side appearances in that equation. The corresponding graph
transformation involves the removal of a self-loop.

The second transformation, $T_2$, of an equation set E is obtained by
substitution for a variable appearance, say $X_a$, on the right of an
equation for, say, $X_b$ with the right side of the equation for $X_a$, and
the gathering of terms so as to put the result back in the same form
as the original equation. ($X_a \ne X_b$)

Assume that a transformation is applied to an equation set (and
corresponding graph) E(G) to obtain E'(G'). We will show that if that
transformation is $T_1$, then:

    (1) The min-max cost between any pair of nodes i and
        j in G and G' are the same.

    (2a) Every solution of E' is also a solution of E.

    (2b) Every solution of E is $\le$ some solution of E'.

If that transformation is $T_2$, then:

    (1) The solutions of E' and E are identical. The min-max
        path costs are identical in G and G'.

As long as only these transformations of E(G) to E'(G') are used
the maximum solution of E(G) will remain a solution of E'(G'). Thus
when a series of these transformations is shown to lead to an E(G)


with  a  unique  solution,  that  solution  is  guaranteed  to  be  a  maximum 


$T_1$: Loop removal:

Transformation $T_1$ consists of removing a self-loop (a branch
from a node to itself).

First, (1) we will show that:

Lemma 1.1: Application of $T_1$ to a graph G leaves
           a resultant graph G' having all the
           same min-max costs.

proof: Assume that G has a self-loop on node a. Let the
paths from i to j in G be partitioned into the sets $P_1$ and $P_2$
so that $P_1$ contains all those paths in which the self-loop
branch <a,a> appears at least once. Then the set of all paths
from i to j in G' is given by $P_1' \cup P_2$, in which $P_1'$ is such that
for each path $p_1$ in $P_1$ there is a path $p_1'$ in $P_1'$ which is like
$p_1$ except that all branches <a,a> are removed. But all such
paths in $P_1'$ are then paths in G in which no branch <a,a>
appears and thus are in $P_2$. So the set of all paths from i to
j in G' = $P_2$. Now it will be shown that a min-max cost branch
of G from i to j must be in one of the paths of $P_2$. From this
it follows that it will be the same min-max cost as that of G'.


To show that a min-max cost branch from i to j in G is on
a path in $P_2$ we need some definitions. Let maxbc(p) be the
cost of the maximum cost branch on path p. If $p_1$ and $p_2$ are
paths we say $p_1 \ge p_2$ when $p_2$ consists of a subset of the
branches of $p_1$. If $p_1 \ge p_2$ then obviously
maxbc($p_1$) $\ge$ maxbc($p_2$). Clearly by definition of $P_1$ and $P_2$, for
each path $p_1 \in P_1$ there is a path $p_2 \in P_2$ with $p_1 \ge p_2$; then for
each value maxbc($p_1$) there is a path $p_2$ in $P_2$ whose value
maxbc($p_2$) $\le$ maxbc($p_1$) and consequently:

    $\min_{p \in P_2}(\mathrm{maxbc}(p)) \le \min_{p \in P_1}(\mathrm{maxbc}(p))$

or:

    $\min_{p \in P_1 \cup P_2}(\mathrm{maxbc}(p)) = \min_{p \in P_2}(\mathrm{maxbc}(p))$

Since $\min_{p \in P_1 \cup P_2}(\mathrm{maxbc}(p))$ is the min-max cost from i to j in G,
that min-max cost involves only the paths in $P_2$ as asserted.


So transformation T_1, loop removal, does not affect the min-max
cost from i to j. Next it is necessary to show (2a) that:


Lemma 1.2a: The set of equations, E', corresponding to G'
will have solutions each of which will also satisfy E, the
set of equations corresponding to G.
proof: In equation terms the transformation involves a change of a
recursive equation, i.e. one in which the same variable
appears on the left and right. Assume that the variable X_a is
involved in such a recursive equation and T_1 is applied to that
equation; then, with β being an expression involving
minimums of maximums of constants and variables:

    X_a = (c_aa ⌈ X_a) ⌊ β

in E becomes:

    X_a = β

in E'. All other equations go into E' intact.

Consider any solution to E'. Let X_i have the value v(X_i)
in this solution. When these values are substituted into an
expression β its value is designated v(β). Let us see how
these values work in the E equations. For those equations
X_j = e_j which go over unchanged into E', evaluation yields
X_j = v(e_j) in both E' and E. For the X_a equation of
E' and the corresponding recursive equation of E, however,
X_a = v(β) and X_a = (c_aa ⌈ v(X_a)) ⌊ v(β) are the respective
evaluations. But v(X_a) = v(β) by definition, since that is its
value in E', so:

    X_a = (c_aa ⌈ v(β)) ⌊ v(β)

    X_a = c_aa ⌊ v(β)       if c_aa ≥ v(β)
        = v(β);
    else
    X_a = v(β) ⌊ v(β)       if c_aa ≤ v(β)
        = v(β)

Thus any solution of E' is also a solution of E.


Next it is necessary to show that:

Lemma 1.2b: For any solution S of E which is not a solution of E',
there is another solution S' which satisfies both E and E'
in which each variable has at least as large a value as it
has in S.

Proof: Assume that the T_1 transformation from E to E' corresponds to
the removal of a self-loop on node a of the corresponding
graph, i.e. recursion in the equation for X_a. Thus the
difference between E and E' is that the first has the equation
X_a = (c_aa ⌈ X_a) ⌊ β whereas the second has X_a = β. For each of the
other variables the same equation is in E and E'. Now let X_i
have the value w(X_i) in a solution of E and an expression β in
the variables have the value w(β). In particular:

    X_a = (c_aa ⌈ w(X_a)) ⌊ w(β) = w(X_a)

    if c_aa > w(X_a) then
        X_a = c_aa ⌊ w(β) = w(X_a) = w(β)

    if c_aa ≤ w(X_a) then
        X_a = w(X_a) ⌊ w(β) = w(X_a)

In general then in E:

    (1)  X_a = w(X_a) ≤ w(β)

If these same values w(X_i) are substituted in E' for X_i we get:

    (2)  X_a = w(β)     (note β does not contain X_a)

A difference of solutions for X_a in E and E' only exists if the
< holds in (1). Assume that to be the case, i.e.
w(X_a) = w(β) - ε, ε > 0, and consider the following iterative
process on the equation set E'. Substitute on the right of the
equations of E' for each X_i the value w(X_i), also designated
X_i^(0), and evaluate. The resultant value computed for each X_i
in E' is designated X_i^(1). Note that it follows from the above
discussion that X_i^(1) ≥ X_i^(0) for all 1 ≤ i ≤ n. Generally designate
by X_i^(j) the value of X_i which results when X_i^(j-1) is
substituted on the right of the equations of E' and the
resultant expressions evaluated. Then it is easy to see that,
because of the monotonic properties of the min-max equations,
each X_i^(j) ≥ X_i^(j-1). Because of the min-max nature of these
equations, also, the value that any particular variable can
assume in these evaluations is confined to a member of the set:

    {w(X_i) | i ∈ N} ∪ {c_ij | i ∈ N, j ∈ N} ∪ {B_i | i ∈ N}

This is so because the right side of these equations is always
the minimum of maximums of these values and thus must equal one
of them. For some finite j, then, X_i^(j+1) will equal X_i^(j) for all
i ∈ N, since their values are chosen from a finite set while
always either increasing or remaining unchanged with an
increase in j. Clearly for this value of j, {X_i^(j) | i ∈ N} is a
solution to E', and it is a solution in which
X_i^(j) ≥ w(X_i) by transitivity and thus is greater than
the solution to E.
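
The iterative process in the proof can be sketched in a few lines. This is an illustration only, under assumed conventions: each equation is stored as a row of coefficients c[i][j] plus a constant B[i], math.inf marks an absent term, and the two-variable equation set is hypothetical. The iteration is only claimed to settle when started from a solution of E, as in the proof, so a round limit is included as a guard.

```python
import math

INF = math.inf  # assumed marker for an absent term

def evaluate(c, B, x):
    """Evaluate every right side once:
    X_i = min(B[i], min over j of max(c[i][j], x[j]))."""
    n = len(x)
    return [min([B[i]] + [max(c[i][j], x[j]) for j in range(n)])
            for i in range(n)]

def iterate(c, B, start, max_rounds=1000):
    """Repeat the substitution until the values stop changing."""
    x = list(start)
    for _ in range(max_rounds):
        nxt = evaluate(c, B, x)
        if nxt == x:
            return x
        x = nxt
    raise RuntimeError("no fixed point reached within max_rounds")

# Hypothetical E:   X_1 = (2 ⌈ X_1) ⌊ 5      X_2 = (4 ⌈ X_1) ⌊ 7
c_E = [[2, INF], [4, INF]]
# E' is E after T_1 removes the recursion in the equation for X_1.
c_E_prime = [[INF, INF], [4, INF]]
B = [5, 7]

S = [3, 4]                           # satisfies E but not E'
S_prime = iterate(c_E_prime, B, S)   # the larger solution S' of the lemma
```

Here the successive values are (3, 4), (5, 4), (5, 5); the final pair satisfies both E and E' and is at least as large as S in every variable.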


This completes the demonstrations of the properties of T_1.


Next consider transformation T_2. This transformation is first
defined precisely.


Consider a set of min-max equations E like those on page 7. The
equation for X_a has the form:

    X_a = [c_a1 ⌈ X_1] ⌊ [c_a2 ⌈ X_2] ⌊ ... ⌊ [c_an ⌈ X_n] ⌊ B_a

Consider the substitution of the right-side of this equation for the
appearance of X_a on the right of another equation for, say, X_b, b ≠ a.

    (1)   X_b = [c_b1 ⌈ X_1] ⌊ ...
                ⌊ [c_ba ⌈ ([c_a1 ⌈ X_1] ⌊ ... ⌊ [c_an ⌈ X_n] ⌊ B_a)] ⌊ ...
                ⌊ [c_bn ⌈ X_n] ⌊ B_b

    (1')  X_b = [(c_b1 ⌊ (c_ba ⌈ c_a1)) ⌈ X_1] ⌊ ...
                ⌊ [(c_bn ⌊ (c_ba ⌈ c_an)) ⌈ X_n] ⌊ B_b ⌊ (c_ba ⌈ B_a)

Let E' be the same as E with the exception that the equation for
X_b is replaced by (1') above with its coefficients evaluated. Then
the substitution and evaluation which produced E' from E is
transformation T_2.
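
The substitution step can be checked mechanically. The sketch below is an illustration under stated assumptions, not the report's procedure: equations are stored as coefficient rows with math.inf for an absent term, the helper names are hypothetical, and the coefficient given to X_a in the new X_b equation (the maximum of c_ba and c_aa) is what distributing the maximum over the minimum yields, adopted here as an assumption. The check compares the substituted form (1) with the evaluated form (1') over a grid of variable values.

```python
import math
from itertools import product

INF = math.inf  # assumed marker for an absent term

def rhs(c_row, B_i, x):
    """Right side of one equation: min(B_i, min over j of max(c_ij, x_j))."""
    return min([B_i] + [max(cj, xj) for cj, xj in zip(c_row, x)])

def substituted_rhs(c, B, a, b, x):
    """Form (1): the X_b equation with X_a replaced by X_a's right side."""
    terms = [B[b], max(c[b][a], rhs(c[a], B[a], x))]
    terms += [max(c[b][j], x[j]) for j in range(len(x)) if j != a]
    return min(terms)

def apply_t2(c, B, a, b):
    """Form (1'): evaluated coefficients and constant of the new X_b equation."""
    n = len(B)
    row = [min(c[b][j], max(c[b][a], c[a][j])) for j in range(n)]
    row[a] = max(c[b][a], c[a][a])  # assumption: result of distributing over X_a's own term
    return row, min(B[b], max(c[b][a], B[a]))

# Hypothetical three-variable equation set; substitute X_0 into the X_1 equation.
c = [[INF, 3, 6],
     [4, INF, 2],
     [INF, INF, INF]]
B = [8, 9, 1]
new_row, new_B = apply_t2(c, B, a=0, b=1)

forms_agree = all(
    substituted_rhs(c, B, 0, 1, x) == rhs(new_row, new_B, x)
    for x in product(range(10), repeat=3)
)
```

In this example the new X_1 equation acquires a term in X_1 itself (coefficient 4), which is the kind of recursion T_1 is then used to remove.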

If the original set E corresponds to graph G, the G' corresponding
to E' above, created by the transformation T_2, is clearly G with the
cost of the branch from b to j changed to c_bj ⌊ (c_ba ⌈ c_aj) if j ≠ a, and
with the cost of the branch from b to a set to [illegible]. The change from G
to G' is illustrated below.


The claim now is that any solution of E is a solution of E' and
vice versa. Thus the min-max path cost in G and G' will be the same.


Lemma 2.1: If E' results from E by transformation T_2, then
E and E' have the same solutions.

Proof: E and E' differ only in their respective equation for X_b.
Suppose in a solution of E the value of X_i is v(X_i), 1 ≤ i ≤ n.
Then for each equation except that for X_b these same values
will, when substituted on the right in E', give the same result
as in E for the corresponding equations. We can see by the
equality of (1) and (1') for all variable values that
substituting v(X_i) for X_i on the right will result in X_b in E'
being equivalent to the following (1) form equation:

    X_b = [c_b1 ⌈ v(X_1)] ⌊ ...
          ⌊ [c_ba ⌈ ([c_a1 ⌈ v(X_1)] ⌊ ... ⌊ [c_an ⌈ v(X_n)] ⌊ B_a)] ⌊ ...
          ⌊ [c_bn ⌈ v(X_n)] ⌊ B_b

Because the parenthesized subexpression (underlined in the original)
is the right side of the equation for X_a with v(X_i) substituted for
all variables, its value is v(X_a):

    X_b = [c_b1 ⌈ v(X_1)] ⌊ ... ⌊ [c_ba ⌈ v(X_a)] ⌊ ... ⌊ [c_bn ⌈ v(X_n)] ⌊ B_b

which has the same right side as X_b in E when evaluated. So
in E':

    X_b = v(X_b)

On the other hand, suppose in a solution for E', X_i is given
the value w(X_i). Substituting these values on the right of
all equations in E except that for X_b will give the same values for
all variables in E except X_b. For X_b in E we have:

    X_b = [c_b1 ⌈ w(X_1)] ⌊ ... ⌊ [c_ba ⌈ w(X_a)] ⌊ ... ⌊ [c_bn ⌈ w(X_n)] ⌊ B_b

but in E', again because of the equality of (1) and (1') and the
fact that [c_a1 ⌈ w(X_1)] ⌊ ... ⌊ [c_an ⌈ w(X_n)] ⌊ B_a (the
parenthesized subexpression of the previous paragraph) is the evaluated
right-side of X_a in E', X_b will have the same evaluated
right-side as given above for E. Thus w(X_b) will be the value
of X_b in E also.


We already know that amongst the solutions to E are the min-max
path costs for the corresponding graph G.

Consider now a series of applications of T_1 and T_2 starting with
an initial graph G and corresponding equation set E. Let these yield:

    E(G) = E_1(G_1) → E_2(G_2) → ... → E_n(G_n)

in which E_i(G_i) → E_{i+1}(G_{i+1}) indicates that T_1 or T_2 was used to get
E_{i+1} from E_i. Assume further that in E_n there are only constants on
the right of each equation; then:


Theorem 2: (1) evaluation of the right-sides of E_n for the X_i's
gives the min-max costs in G from node i to node n and (2)
the solutions in E_n will be maximum solutions of E.

Proof: (1) follows from the following inductive argument:

Assume that amongst the solutions to E_i are the min-max
path costs of G, and that the min-max cost of G_i is equal to
the min-max cost of G. If E_{i+1} is obtained by T_2 from E_i then
by lemma 2.1 it will have the same solutions as E_i and so will
also include the min-max costs of G, and G_{i+1} will have the same
min-max costs as G. If G_{i+1} is obtained by T_1 from G_i then,
since by lemma 1.1 G_{i+1} has the same min-max path costs as G_i,
it will have the same such costs as G.

(2) follows with a similar argument from the fact that
under transformation of E_i to E_{i+1} by T_1, although solutions
may be lost they are all (by lemma 1.2b) smaller than the
remaining ones. And under T_2 no solutions are lost, as stated
by lemma 2.1. Therefore, when only one solution remains it
must be the maximum solution to E.

Now there are a number of orders in which the transformations
T_1 and T_2 can be applied to obtain the sequence of equation sets
E_1, E_2, ..., E_n with E_n having only constants on the right. We give
one such order of application which is analogous to the process used
in Gaussian Elimination.


Algorithm2

      E_1 ← Equation Set
      j ← 1
      i ← 1

T1:   if (in E_i the equation for X_j has X_j on the right, i.e. is
         recursive) then (remove that recursion by applying T_1; i ← i+1)
      if (j = n) then DONE
      k ← 1

T2:   if (on the right of the equation for X_{j+k}, X_j appears in E_i)
         then (use T_2 to substitute the right-side of the
         X_j equation on the right of the X_{j+k} equation, thus
         eliminating X_j from the right of the X_{j+k} equation; i ← i+1)
      if (j+k = n) then (j ← j+1; go to T1)
      k ← k+1
      go to T2


Each of the above steps can obviously be carried out. As a
result, variables are continually eliminated from the right of
successive equation sets in the sequence E = E_1, ..., E_n. In
going from E_j to E_{j+1} variable X_j is eliminated from the equation for
itself by T_1 if necessary, and then from the equations for
X_{j+1}, X_{j+2}, ..., X_n by T_2. This is done for j = 1 to n-1, finally
leaving the final equation set with no variables, only constants,
on the right of each equation.


The point of this entire development is that the maximal solution
to the equation set E corresponding to graph G gives the min-max paths
from node i to node n in G. Therefore, Algorithm1 - incorporating the
minimum cost principle, which is more efficient than Algorithm2 - can
be used to get the min-max path cost.

Final  Note 

The above properties of transformations T_1 and T_2 were shown to
be applicable to the min-max path set of equations. In fact, these
properties must apply to a large class of such equation sets,
including the min-max set as one member. Virtually any equation set
has the properties required of T_2. The properties of T_1, on the
other hand, will only be true for a constrained class of equation sets.
For this larger class of equation sets Algorithm2 would surely be
applicable to get the required maximal solution. Furthermore, if the
particular equation set satisfies the conditions for application of
the minimum constant principle also, then the more efficient
Algorithm1 would be applicable.

The class to which the desirable properties apply appears to
have considerable similarity with the closed semi-ring as defined in
section 5.6 of [3]. We are currently at work on its delineation.

Transformation T_2 maintains its properties for virtually any equation set.
T_1, on the other hand, will only maintain its properties under restrictive
conditions.


References

[1] Paull, M.C.; A Principle Useful In The Design Of Minimum Path
And Other Algorithms; To be published.

[2] Ng, Shuey; Walker, A.; Max-Min Chaining Of Weighted Assertions
Is Loop-Free; Internal Report CBM-TR-73; Dept. of Computer
Science, Rutgers - The State University of New Jersey, 1977.

[3] Aho, A. V.; Hopcroft, J.E.; Ullman, J.D.; The Design and
Analysis of Computer Algorithms; Addison Wesley, 1975.


SOSAP-TR-36 


July  1977 


APPROACHES  TO  AUTOMATIC  PROGRAM  GENERATION 
S.  V.  Levy 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6  to  the 
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research Projects Agency or the U. S. Government.


APPROACHES  TO  AUTOMATIC  PROGRAM  GENERATION 


The  object  of  this  research  has  been  to  gain  insight 
into  the  problem-solving  process  in  general  and  the  programming  process 
in  particular.  We  chose,  for  our  vehicle,  a  restricted  class  of 
problems - the generation of programs for determining order
characteristics of a set of numbers (selection, sorting, etc.).

There are three levels of knowledge which we used in the
generation of 'good' programs: knowledge about programming, knowledge
about sorting and comparing, and knowledge about numbers and their
ordering properties.

Our earliest work considered the general subject of generating
algorithms for arranging numbers on the basis of numerical comparisons.

A  specification  of  the  desired  arrangement  of  the  data  in 
the  output  sequence  serves  as  input  to  the  "automatic  programmer"  which 
generates  a  program  to  rearrange  the  input  data  into  that  order.  It  is 
necessary  to  specify  a  rule  for  manipulating  the  data,  and  a  rule  for 
determining  that  the  program  has  successfully  completed  its  task.  The 
first  rule,  since  it  is  almost  completely  concerned  with  order  on  the 
data,  is  easy  to  specify  -  the  second  rule,  the  stopping  condition, 
turned  out  to  be  considerably  more  complex. 

Rule 1 is, in essence: compare two inputs; if they don't
satisfy the definition of the output condition, manipulate them so that
they do; if they do satisfy the definition, repeat the process with
another element. This rule is necessarily vague at this point. We shall
see, below, how it might actually be effected in practice.

Rule  2  is  considerably  more  complex  in  its  interpretation 
and  could  easily  produce  substantial  payoffs  in  the  running  time  of  the 
programs  produced  by  the  system. 

The simplest algorithms which we produce use only the space
in which the inputs are presented (plus perhaps a fixed size workspace,
independent of the size of the input set). More sophisticated
algorithms, which are able to make use of the transitivity and other
properties of the order relation on real numbers, seem in our experience
to make use of a space which is dependent on the size of the
input set. In order to derive these more sophisticated variants, it
is necessary that the automatic programmer include some type of theorem
generator. Some simple examples of output specifications and the programs
we expect and hope to produce will make this last section more
clear.

Examples 

Output  specification  1:  (i) 

where {B_i}, i = 1, 2, ..., n, is the output set.
In  this  case,  the  program  that  would  be  produced  is  obvious  -  at  least 
as  far  as  manipulating  the  data.  The  part  of  the  program  which  generates 
a  stopping  condition  is  less  obvious.  The  algorithm  consists  of 
comparing  the  first  two  inputs:  if  the  larger  is  already  on  the  right, 
leave  the  elements  in  their  present  order  and  compare  elements  2  and  3. 
Continue  this  until  all  elements  are  compared,  then  go  back  and  make 
another  pass  through  the  input  set.  This  is  repeated  until  all  the 
inputs  are  placed  in  their  correct  position. 

How  can  the  program  decide  that  all  the  data  are  ordered? 

Several  methods  come  to  mind.  Fortunately  in  this  case  the  most  obvious 
method  works  -  when  the  algorithm  makes  a  complete  pass  through  the  input 
data  but  performs  no  interchanges,  the  data  is  ordered.  In  order  to  know  this 
has  occurred  it  is  necessary  for  the  algorithm  to  keep  track  of  the  number 
of  exchanges  on  each  pass,  or  at  least  whether  or  not  an  exchange  has 
been made. An alternate stopping condition which results in a generally
poorer algorithm (at least as measured by running time) makes use of
the fact that after n(n-1)/2 comparisons every element will have been
compared to every other element, and if it can be shown that the
algorithm always moves in the correct direction, then it will always
be complete after that many comparisons.


The Program

      Body  0.  I=1
            1.  If A(I) ≤ A(I+1) goto 2
                goto 3
            2.  B(I)=A(I)
                I=I+1
                goto 4
            3.  B(I)=A(I+1)
                A(I+1)=A(I)
                I=I+1
                goto 4
            4.  If I<N goto 1

The program so far includes only the body which manipulates the data for
a single pass through the inputs--we have yet to put in the section of
the program which handles the repeated passes through the inputs and
ultimately the termination of the algorithm.

To complete the program we add the following steps:

      At step 3, before B(I)=A(I+1):   P=1
      After step 4:                    If P=1 goto 5
                                       End
                                   5.  P=0
                                       goto 0

This will allow the program to terminate if a pass is made in which
no interchanges take place (as indicated by the value of the variable P).
An alternate stopping condition which makes use of the number of
comparisons would include the following steps:

      Before step 0:                     P=0
      Right after 2 and right after 3:   P=P+1
                                         If P=[illegible] goto 5
      After step 4:                  5.  End

This program is easily recognizable as bubble sort or
selection where on each pass the largest not-as-yet-ordered input is
propagated to its proper place.

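
The pass-and-exchange procedure for output specification 1, with the "no exchange on a complete pass" stopping condition, can be sketched as follows. This is a minimal illustration rather than the report's program: it sorts a single list in place instead of using separate input and output arrays, and the function name and the returned comparison count are additions made for the example.

```python
def bubble_sort_with_flag(data):
    """Sort by repeated passes over adjacent pairs; stop after the first
    pass in which no exchange was made. Returns (sorted list, comparisons)."""
    a = list(data)
    comparisons = 0
    while True:
        exchanged = False                 # records whether this pass exchanged anything
        for i in range(len(a) - 1):
            comparisons += 1
            if a[i] > a[i + 1]:           # larger element is on the left: exchange
                a[i], a[i + 1] = a[i + 1], a[i]
                exchanged = True
        if not exchanged:                 # a clean pass means the data are ordered
            return a, comparisons

result, count = bubble_sort_with_flag([3, 1, 2])
```

For the input [3, 1, 2] the first pass makes two exchanges and the second pass makes none, so the procedure stops after four comparisons.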


'Output  specification  2:  (i)(j)(i$ 

In this case again the data manipulation part of the program
is obvious, but unless some cleverness is introduced into the termination
decision, the algorithm produced will be very inefficient.
Essentially the algorithm takes 1 element at a time, compares it to all
the others, and places it between the set of elements greater than it
and the set smaller than it. This is clearly insertion, and on each
pass one element is put into its final place. Consequently the algorithm
terminates after N passes.

The  Program 

0.  I=1          we place A(I) at each pass
1.  K=1          It is placed in position K
    J=1

2.  If J≠I goto 3
    J=J+1
    If J>N goto 5

3.  If A(I) < A(J) goto 4
    K=K+1

4.  J=J+1
    If J≤N goto 2

5.  B(K)=A(I)
    I=I+1
    If I≤N goto 1
    End
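
The placement procedure for output specification 2 can be sketched in Python as below. This is an illustration under assumptions: indices are 0-based, the function name is hypothetical, and ties are broken by original position, a detail added here so that equal inputs do not land in the same output slot.

```python
def place_by_rank(a):
    """For each element count how many others belong before it, then
    store it directly at that position of the output."""
    n = len(a)
    b = [None] * n
    for i in range(n):
        k = 0                                  # final position of a[i]
        for j in range(n):
            if j == i:
                continue
            # a[j] precedes a[i] if it is smaller, or equal but earlier
            # in the input (the tie-break is an assumption of this sketch)
            if a[j] < a[i] or (a[j] == a[i] and j < i):
                k += 1
        b[k] = a[i]
    return b

ordered = place_by_rank([3, 1, 2])
```

Each of the N passes compares one element with all N-1 others, which is the inefficiency the text attributes to this scheme when nothing is known about transitivity.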


In addition to the program generator, which seems very straightforward,
it would be necessary to produce some sort of generator of stopping
conditions. This would seem to be a very complex part of the system.
It would be concerned with proving such properties of candidate programs
as that they are moving in the correct direction (e.g., it is never
necessary to repeat an exchange of two data which are once exchanged by
the algorithm). This property together with a bound on the number of
possible interchanges could provide a bound on and proof of termination
of the algorithm.

If besides there were some sort of theorem prover which knew something
about the properties of numbers it might be possible to discover possible
efficiencies to be introduced into the program. For example, if the
program generating process 'knew' about the transitivity of the ordering
relation it would be able to reduce considerably the number of
comparisons required in the second algorithm.

Although it seems clear that any truly useful automatic programmer
may indeed require all these features, it seemed that to start by
trying to create a full-blown system would divert too much of the
work to difficult problems already being studied by many other
researchers (with very mixed degrees of success) and would not permit
sufficient attention to the main program generating problems. It
seemed clear to us that what we needed was a greatly reduced domain
where we would still be faced with many of our original problems, but
where we could bring our special expertise to bear on the generation
of programs to solve particular problems.

We chose as our domain of study the non-adaptive comparator
algorithms for several reasons:

1) The domain is of sufficient complexity to make the problem interesting
and for the results to tell us something about the creation of real programs,
but at the same time is sufficiently well-contained so as to enable us to
evaluate our progress. In addition, the measure of complexity of these
algorithms is simple--the number of comparison operations required to perform the
computation. There are no problems of tradeoffs between storage and computation
because the non-adaptive algorithms can all be made to work in minimum storage
space.

2) We have a great deal of experience with the design of certain classes
of these algorithms and have developed special tools which are useful in the
design procedure. The basic tools include the Levy-Paull sorting algebra
(Levy, Paull, 1969); a recursive technique for generating 'good' sorting
algorithms for 2^K inputs and then going from these algorithms to 'good' sorting
algorithms for 2^K + i inputs (i << 2^K); the Batcher odd-even Merge (Batcher, 1968)
strategy; and the splitting strategy which we describe below.

These algorithms have as their basic operation a comparison between data
stored in two memory locations followed by the assignment of the smaller of
the two inputs to one location and the larger to another. The algorithms are
called non-adaptive because the sequence of operations is totally independent
of the contents of the storage locations, but is completely determined at the
start of the algorithm. The algorithms can also be carried out by networks of
'comparators' (a comparator is a 2 input - 2 output computational device which
operates on numbers as inputs and produces as outputs the minimum of the inputs
at one (designated) output and the maximum at the other output). The networks
are formed by connecting outputs of comparators to inputs of other comparators
in an obvious (loop-free) way.

There have been several studies of networks for sorting (Levy, Paull, 1969;
Batcher, 1968; Van Voorhis, 1972; Knuth, 1973) which have appeared in
the literature; we shall ignore the general sorting problem here and instead
focus on the following problem:

The selection problem - to find the i-th largest input.

(Note that when we look for the i-th largest, we always assume that i ≤ n/2.
If this is not the case, then we would look for the (n-i+1)-th smallest.
The transformation between the algorithms is obvious.)

The  goals  of  this  research  are  to  learn  to  design  programs,  not 
networks;  and  we  do  not  treat  this  as  a  network  design  problem.  We  are 
interested in the question of how one goes from the specification of a
problem to the specification of an algorithm to solve that problem.

It should be obvious that non-adaptive algorithms (NAA - from here on),
in general, prove to be less efficient than unrestricted programs to compute
the same functions (i.e., the NAAs require a greater number of comparisons).
To find either the largest or the smallest input (out of n) requires n-1
comparisons in both the NAA and the program case. This is easily shown to be
optimal - and realizable. However, to find the 2nd element requires
n + ⌈log2 n⌉ - 2 comparisons in the unrestricted case and 2n-3 in the NAA
case. The former is optimal and depends on the algorithm's locating the first
element and then looking for the second element only among the (at most)
⌈log2 n⌉ inputs which 'lost' to the first element the first time around. (The
second had to have 'lost' to the first--no other input could 'beat' it.) In
order to perform the algorithm it is necessary to 'remember' which elements
lost the first time. This is easy to do with a program, but not possible in
an NAA. The best that can be done by the NAA is to repeat the first pass,
that is, find the smallest of the remaining n-1 elements. In order to find
the second element it is necessary to know the first. Therefore, the
(n-2) + (n-1) = 2n-3 comparisons to find the second input can be shown to be
necessary and sufficient for NAA's. However, to find the third it is not
necessary to know which is first and which is second; and it is possible, as
we shall show below, to find the third input in fewer than
3n-6 = (n-1) + (n-2) + (n-3) comparisons.

The basic operation of the NAA may be represented as

      (L1, L2) ← Comp (L1, L2)

which is interpreted as

      L1 ← min (L1, L2)
      L2 ← max (L1, L2)

In network notation we can represent this operation by a single comparator
element.



An example of a minimal NAA to find the second largest of eight inputs is:

 1)  (L1, L2) ← Comp (L1, L2)
 2)  (L1, L3) ← Comp (L1, L3)
 3)  (L1, L4) ← Comp (L1, L4)
 4)  (L1, L5) ← Comp (L1, L5)
 5)  (L1, L6) ← Comp (L1, L6)
 6)  (L1, L7) ← Comp (L1, L7)
 7)  (L1, L8) ← Comp (L1, L8)
 8)  (L2, L3) ← Comp (L2, L3)
 9)  (L2, L4) ← Comp (L2, L4)
10)  (L2, L5) ← Comp (L2, L5)
11)  (L2, L6) ← Comp (L2, L6)
12)  (L2, L7) ← Comp (L2, L7)
13)  (L2, L8) ← Comp (L2, L8)     and the answer appears in location L2
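
The difference between the two comparison counts for n = 8 (n + ⌈log2 n⌉ - 2 = 9 for an unrestricted program against 2n-3 = 13 for an NAA) can be seen by running both kinds of procedure on the same data. The sketch below is illustrative only; the function names are hypothetical. It reads Comp as leaving the smaller value in the first location, so both procedures return the second element counted from the small end; exchanging the roles of minimum and maximum gives the second largest with the same counts.

```python
def second_by_naa(values):
    """Fixed schedule of 2n-3 comparators, independent of the data.
    Each comparator leaves the smaller value in the first location."""
    loc = list(values)
    n = len(loc)
    schedule = [(0, j) for j in range(1, n)] + [(1, j) for j in range(2, n)]
    for i, j in schedule:
        loc[i], loc[j] = min(loc[i], loc[j]), max(loc[i], loc[j])
    return loc[1], len(schedule)

def second_by_tournament(values):
    """Adaptive program: a knockout finds the first element while each
    survivor remembers what it beat; the second is then sought only
    among the elements that lost directly to the first."""
    comparisons = 0
    players = [(v, []) for v in values]
    while len(players) > 1:
        survivors = []
        for k in range(0, len(players) - 1, 2):
            (x, beaten_x), (y, beaten_y) = players[k], players[k + 1]
            comparisons += 1
            if x <= y:
                survivors.append((x, beaten_x + [y]))
            else:
                survivors.append((y, beaten_y + [x]))
        if len(players) % 2:              # odd player out advances unopposed
            survivors.append(players[-1])
        players = survivors
    _, beaten = players[0]
    second = beaten[0]
    for v in beaten[1:]:
        comparisons += 1
        if v < second:
            second = v
    return second, comparisons

data = [5, 3, 8, 1, 9, 2, 7, 6]
naa_result = second_by_naa(data)
program_result = second_by_tournament(data)
```

The tournament needs the list of losers, which depends on the data; that bookkeeping is exactly what a fixed comparator schedule cannot express.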

This  algorithm  can  also  be  represented  by  the  following  comparator  network. 


The Levy-Paull algebra (Levy, Paull, 1968) for sorting algorithms consists
of two binary operators, · (minimum) and + (maximum). The algebra is isomorphic
to a Boolean algebra without negation. Consequently, the set of functions
computed by NAA's is isomorphic to the set of positive Boolean functions. As a
corollary, the functions computed by an NAA are completely determined by what
they do on the input n-tuples of 0's and 1's. (This tells us, for example,
that to test an NAA purported to sort n inputs, it is only necessary to test
it on the 2^n input n-tuples of 0's and 1's, and not on the n! input tuples as
might be expected.) Further, we have shown that the function corresponding
to the i-th largest output is the i-th symmetric function (that is, the sum of
all the products of i distinct variables).

We introduce a macro which we call split, which is based on the following
theorem (note here that +, when used with subscripts, is ordinary addition, not
maximum).

Suppose we have 2n inputs a_1, ..., a_n, b_1, ..., b_n where a_1 < a_2 < ... < a_n
and b_1 < b_2 < ... < b_n. If we call the 2n inputs when completely ordered (a's
merged with b's) c_1, c_2, ..., c_{2n}, then

Theorem:   a_i + b_j ≥ c_{i+j}
           a_i · b_j ≤ c_{i+j-1}

Proof: 1) a_i + b_j is greater than or equal to i a's and j b's, that is,
i + j inputs:

           a_i + b_j ≥ c_{i+j}

2) a_i · b_j can be as large as i a's and j-1 b's, or i-1 a's and
j b's, depending on whether a_i or b_j is the minimum. The minimum
cannot be as large as c_{i+j}. Therefore, a_i · b_j ≤ c_{i+j-1}.

As an immediate consequence of the theorem the following is an algorithm
for finding the n largest numbers out of a set of 2n numbers.

1) Divide the inputs into 2 sets of size n:  a_1, ..., a_n  and  b_1, ..., b_n

2) Sort the two sets

3) Compare (a_1, b_n), (a_2, b_{n-1}), ..., (a_n, b_1)

By our theorem, the max of each of these pairs is ≥ c_{n+1}; therefore, the
set of max's is the n largest inputs.
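
The splitting step can be sketched and checked as follows. This is an illustration under assumptions: both input lists are taken to be in ascending order, Python's sorted stands in for the sorting of the two sets, and the function names are hypothetical. Since the comparison pattern is fixed in advance, the corollary about 0's and 1's is used for an exhaustive check of one small size.

```python
from itertools import product

def split(a, b):
    """a and b are ascending lists of equal length n. Pair a[i] with
    b[n-1-i]; the minima are the n smallest inputs and the maxima the
    n largest (neither output list is itself sorted)."""
    n = len(a)
    small = [min(a[i], b[n - 1 - i]) for i in range(n)]
    large = [max(a[i], b[n - 1 - i]) for i in range(n)]
    return small, large

small_half, large_half = split([1, 4, 6, 9], [2, 3, 7, 8])

def holds_on_zero_one(n):
    """Check the split on every 2n-tuple of 0's and 1's."""
    for bits in product((0, 1), repeat=2 * n):
        a, b = sorted(bits[:n]), sorted(bits[n:])
        _, top = split(a, b)
        if sorted(top) != sorted(bits)[n:]:
            return False
    return True

zero_one_ok = holds_on_zero_one(3)
```

The split costs n comparisons once the two sets are sorted, whatever the data.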

We shall make much use of this splitting operation in the work which
follows.

Another operation which we shall call upon frequently is the operation
which makes an n-1 sorter from an n-sorter, by 'peeling'. Given an n-sorter,
we can form an n-1 sorter by assigning one of our inputs to be the maximum
input and eliminating all the comparisons through which it passes.

This operation is much easier to illustrate with a network than a program
but the translation between the two representations is obvious.
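
In program form, peeling can be sketched as below. The sketch rests on assumptions that are not in the report: a network is a list of index pairs (i, j) with i < j, each comparator leaves the smaller value on line i, and the maximum input is placed on the last line, where it never moves, so that no relabeling of lines is needed. The five-comparator 4-sorter used here is a standard one, chosen for illustration; it is not necessarily the network of the report's figure.

```python
from itertools import product

def run_network(network, values):
    """Apply each comparator in order: smaller value to line i, larger to line j."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v

def is_sorter(network, n):
    """Test on all n-tuples of 0's and 1's, which suffices for a fixed network."""
    return all(run_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))

def peel(network, n):
    """Fix the maximum input on line n-1 and drop every comparator it passes through."""
    return [(i, j) for i, j in network if i != n - 1 and j != n - 1]

four_sorter = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
three_sorter = peel(four_sorter, 4)
```

Every comparator touching the last line is a no-op when that line holds the maximum, so removing those comparators leaves the behavior on the other lines unchanged.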

Example:  To  construct  a  3-sorter  from  a  4-sorter 


The x's mark the comparisons which involve the maximum input, a. We can
eliminate these comparisons from the 4-sort algorithm yielding the


The remaining 5-sorter is exactly the same as the five sorter designed by
insertion above. Given these constructions and any good general construction
for building a 2^K-sorter (e.g., the Batcher odd-even sort merge strategy
(Batcher, 1968)), we can now define a macro sort(K) which sorts K inputs.

Thus far we have defined two macros: sort(K), and split(K), which takes
two sorted K-input sequences as inputs and splits them into
two sets - the K largest and the K smallest inputs.

We must also define a pair of macros max(n) and min(n) which find the
maximum and minimum of n inputs; and a merge macro (again we use, for
example, the Batcher merge (Batcher, 1968), which is currently the best merge
NAA known).

Our attention has been directed towards the problem of finding
the i-th largest input. We know that for the first or second (from
either end), n-1 and 2n-3 comparisons, respectively, are required. There
are no good bounds known for the other outputs. It was suspected
that an NAA to find the median would be as complex (require as many
comparisons) as an NAA to sort. We will show below that this is not
necessarily the case.

Our basic strategy will be to home in on the appropriate output by a
series of sorts, splits, merges, maxs and mins, eliminating inputs as possible
candidates until we produce the final result.

The cost of the sort and merge macros grows faster than linearly with
the size of the input sets. Therefore, it would seem a reasonable tactic to
perform sorts and merges on as small sets as possible consistent with our
problem. Thus, a 'divide and conquer' strategy is suggested for sorts and
merges. This advice suggests that we begin our operation by dividing the
input set into smaller groups and then sorting these groups. If we are looking
for the i-th input, and we divide the input set into groups larger than i and
sort them, then any element beyond the i-th of each group is beyond the i-th
of the whole set and can be disregarded in the later stages of the algorithm.
On the other hand, if we divide the input set into sets smaller than or equal
to size i and sort them, then we cannot eliminate any possible candidates
for i-th largest at this stage of the algorithm. Therefore, as a first
estimate, we should divide the input set into approximately equal sets of
size greater than i.

In the final step of a 'find (the i-th)' algorithm, we shall either use
a max (or min) macro (the only macros we have which select a single
element) or we use a merge or sort macro on some subset of the outputs which
finds for us the i-th output. Since the split macro destroys order information,
it cannot be used to select a final output.

If we finish with a sort or a min or max, then the preceding step must
have been a split: an operation which left us with an unordered set, but was
used to discard a set of possible candidates.

If we finish with a merge, then it was probably preceded either by a
pair of merges or a pair of sorts.

Now we look at some examples and apply the advice given above:
Notation: find (i,j) is the function which takes j inputs and produces the
i-th largest as output.

Example 1: Find the fourth largest of eight (i.e., compute Find (4,8)).
This example is a simple one and there are few choices in following our advice.
The algorithm first partitions the set into two sets of four - any other
partitioning into equal sized sets results in sets smaller than i. Since the
sets are not ordered, we sort them. At this point we have several choices:
we can merge the two sets and pick out the fourth; or we can split the two sets
into four largest and four smallest, throw away the smallest and find the min
of the largest. The latter strategy proves to be simpler and we adopt it.

Thus, our algorithm is:

1) sort (4)
2) sort (4)
3) split (4,4)
4) min (4)

(In the split step the upper bar indicates: take the max set; a lower bar would be
used for the min set.)

The cost of this algorithm is 17 comparisons; it takes 19 to sort.

Example 2: Find (2,16)

Here our advice leads us to follow several paths. First we divide the
16 inputs into



1)  two  sets  of  eight  or 

2)  four  sets  of  four  or 

3)  eight  sets  of  two 

In each case we start out by sorting. In strategy 1, we eliminate the
six smallest inputs from each of the two sorted sets; in strategy 2, we
eliminate the two smallest inputs from each of the four sorted sets. In
strategy 3, we can eliminate nothing at this point. At the next point, for
strategy 1, we can merge the two ordered two-sets and pick the third or
split them and take the min of the max set; both alternatives are of the same
cost. For strategy 2, we can merge each of the pairs of ordered two-sets
which remain and then merge the two smallest from each pair; or we can split
the pairs and sort the remaining four candidates. Either strategy is of the
same cost. Finally, in strategy 3 nothing can be discarded, so at this point
we introduce four pairwise splits which leave us with eight unordered inputs.
Then we can apply the 3 in 8 algorithm (which appears in subsection 5 below).
The three strategies yield the following algorithms.

1) sort 8; sort 8; merge (2,2)

or sort 8; sort 8; split (2,2); min

2) sort 4; sort 4; sort 4; sort 4; merge (2,2); merge (2,2); merge (2,2)

or sort 4; sort 4; sort 4; sort 4; split (2,2); split (2,2); sort 4

3) sort 2; sort 2; sort 2; sort 2; sort 2; sort 2; sort 2; sort 2; split
(2,2); split (2,2); split (2,2); split (2,2); find (3,8)

where find (3,8) is find 3 in 8 and involves repeating
the procedure with the new parameters.

Actually algorithm 1 is noticeably more complex than any of the others
because of the difficulty of the 8-sorts, which require a lot of computation.
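The comparison counts behind these remarks can be tallied with a small Python sketch. The cost table is an assumption of ours rather than something stated in the report: 1, 5 and 19 comparisons for the best known sorters of 2, 4 and 8 inputs, 3 and 9 comparisons for the Batcher odd-even merge of two 2-sequences and two 4-sequences, K comparisons for split (K,K) (one per pair), and k-1 comparisons for a min or max of k inputs.

```python
SORT_COST = {2: 1, 4: 5, 8: 19}          # assumed: best known sorting networks
MERGE_COST = {(2, 2): 3, (4, 4): 9}      # assumed: Batcher odd-even merge


def cost(steps):
    """Total comparisons of a macro sequence given as (operation, argument)."""
    total = 0
    for op, arg in steps:
        if op == "sort":
            total += SORT_COST[arg]
        elif op == "merge":
            total += MERGE_COST[arg]
        elif op == "split":
            total += arg[0]              # one comparison per pair
        elif op in ("min", "max"):
            total += arg - 1
        else:
            raise ValueError(f"unknown macro: {op}")
    return total


strategy_1_merge = [("sort", 8)] * 2 + [("merge", (2, 2))]
strategy_1_split = [("sort", 8)] * 2 + [("split", (2, 2)), ("min", 2)]
strategy_2_merge = [("sort", 4)] * 4 + [("merge", (2, 2))] * 3
strategy_2_split = [("sort", 4)] * 4 + [("split", (2, 2))] * 2 + [("sort", 4)]
find_4_of_8 = [("sort", 4)] * 2 + [("split", (4, 4)), ("min", 4)]
```

Under these assumed costs both variants of strategy 1 come to 41 comparisons and both variants of strategy 2 to 29, which agrees with the observation that the 8-sorts dominate. The sequence sort(4); sort(4); split(4,4); min(4) comes to 17.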

At this point we should discuss the relation between splitting and
merging. The split operation has the obvious advantage that it allows us to
eliminate several candidate answers in a single operation. However, it destroys
order information (the outputs are unordered sets), and the outputs may have to be
re-sorted in the later stages of the algorithm. It may, therefore, pay in some
cases to merge sorted sequences rather than to split and re-sort from scratch.

In both strategies 1 and 2 above, it appeared to make no difference (from the
viewpoint of total number of comparisons) which strategy was adopted.


5. Additional Examples

1) Find (3,8)

Following our advice we need groups of at least three in size; groups
of four are an obvious candidate. Therefore, we start by sorting the two
groups of four, then we split on three, discarding the fourth member of each
group, and finally take the min of the remaining group.

The algorithm is sort(4); sort(4); split(3,3); min(3)

2) Find (8,16)

This is another example where there is only one reasonable sequence of
operations which follows from our general rules:
sort(8); sort(8); split(8,8); min(8).

3) Find (4,16)

Here the general procedure leads to four alternative algorithms. These are:

(1) sort(8); sort(8); split(4,4); min(4)

(2) sort(4); sort(4); sort(4); sort(4); merge(4,4); merge(4,4); merge(4,4)

(3) sort(4); sort(4); sort(4); sort(4); merge(4,4); merge(4,4); split(4,4); min(4)

or (4) sort(4); sort(4); sort(4); sort(4); split(4,4); split(4,4); sort(4);
sort(4); split(4,4); min(4)

Our algorithm generating procedure depends on the following advice:

1) It is easier to sort groups which are small than large: the complexity
of sorting (or merging) grows faster than linearly with the number of inputs
(probably like n log n).

2) The final step in finding an i-th element will either be a min (or
max) or a merge.

3) If it is to be the former, it will be preceded by a splitting macro;
if the latter, a merge.

4) The size of groups chosen for sorting in the first step is bounded
below by the number of the output sought. It should, as nearly as possible,
divide the size of the input set; and, of course, there is nothing gained by
sorting groups smaller than i (when looking for the i-th output).




5) The number of strategies to be tried has to do with how i divides
the size of the input set.

It is not clear yet whether merging or splitting is the better strategy
(if indeed there is a better strategy). It is hoped that experimenting with
computer generated algorithms for large numbers of inputs will provide some
insight in that direction. This phase of the research has produced the
following strategy for generating 'good' selection algorithms.

A procedure for generating an algorithm to find the i-th out of n is:

(Note that we assume i ≤ n/2; only the distance from 1 or n is significant,
and the results carry over to the other case by symmetry.)

1) Partition the set into as equal as possible parts subject to the
constraint that each part be ≥ i.

2) For each of these partitionings, sort the partitioned elements, then

i) Try merging the sorted sets (using Batcher odd-even merge),
discarding (before merging) any elements beyond the i-th in each set.

ii) Try splitting the sorted sets, discarding (before splitting) any
elements beyond the i-th in each set.

3) In the case that strategy i has been adopted, continue to merge,
discarding elements beyond the i-th in each set.

4) In the case that strategy ii has been adopted, repeat the procedure
from step 1 for the resulting set.

5) Compare the complexity for all the algorithms which are generated and
select the best ones.
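The splitting branch of this procedure (steps 1, 2 ii and 4) can be traced on actual data with the Python sketch below. It is an illustration under assumptions of ours, not the generator itself: the number of groups must be a power of two, the split pairs the k-th smallest of one group with the k-th largest of the other, and the built-in `sorted` stands in for the sort macro. The names `split_largest` and `find_ith_largest` are hypothetical.

```python
def split_largest(a, b):
    """Given two ascending lists of equal length n, return the n largest of
    their union (unordered), using one comparison per pair."""
    n = len(a)
    return [max(a[k], b[n - 1 - k]) for k in range(n)]


def find_ith_largest(values, i, group_size):
    """Sort groups, keep the i largest of each, then repeatedly split pairs
    of groups and re-sort, finishing with a min."""
    if group_size < i or len(values) % group_size:
        raise ValueError("group_size must be >= i and divide the input size")
    # Steps 1 and 2: partition, sort, and discard everything beyond the i-th.
    groups = [sorted(values[k:k + group_size])[-i:]
              for k in range(0, len(values), group_size)]
    if len(groups) & (len(groups) - 1):
        raise ValueError("this sketch needs a power-of-two number of groups")
    # Steps 2 ii and 4: split pairs of groups, re-sort the survivors, repeat.
    while len(groups) > 1:
        groups = [sorted(split_largest(a, b))
                  for a, b in zip(groups[0::2], groups[1::2])]
    return min(groups[0])        # the i largest overall remain; take their min


data = [(7 * k + 3) % 16 for k in range(16)]     # a permutation of 0..15
fourth = find_ith_largest(data, i=4, group_size=4)
```

With i = 4 and groups of four this follows the sequence sort(4) four times, split(4,4) twice, sort(4) twice, split(4,4), min(4), and `fourth` is 12, the fourth largest of 0..15.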

This program generator has been programmed in ILISP and runs on the
Rutgers 10; it is called NAA. How good is NAA? How good are the programs
it generates? It's pretty good, but not optimal; in fact it doesn't even
produce programs as good as a clever human programmer. There are two
reasons for this deficiency. First, we don't in general know any regular
procedure for producing optimal sort macros. We don't want to tackle that
problem here, and in our evaluation we factor that part of the problem out by
substituting the best (smallest number of comparisons) known sorting NAA
into our evaluation procedure. However, NAA failed to produce best
algorithms for another reason: it lacked specific domain knowledge. A
human designing these algorithms would take advantage of knowledge of the
order properties of the numbers to eliminate possible candidates from
consideration as the algorithm progresses.

We have written an improved version of NAA called NAAI which uses
some additional knowledge about the numbers to produce the best algorithms
known in all the cases we have tried.

Essentially, this knowledge is encoded into an alternate approach to
algorithm generation which sorts [illegible] inputs for the largest i smaller than
[illegible], and then inserts the remaining inputs into the sorted set one at
a time, eliminating any outputs which cannot be the desired one as the search
progresses. This requires a new operation "INSERT (n)" which inserts a
number into n sorted numbers. It works at a cost of n. There is another
operation "ELIMINATE" which does not require any comparisons.

Example: Find (3,6)

1) Sort 4
2) Eliminate 4
3) Insert 3
4) Eliminate 1, 4
5) Insert 2
6) Eliminate 1, 3

Although we have not been able to show optimality for NAAI, for all
cases we have tried NAAI generates the best selection algorithms known.
Knuth (Knuth 1973) enumerates minimum values for find (i,n) for
several cases and NAAI produces algorithms which meet those values.
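The sort, insert and eliminate approach can be sketched in Python as below. This is our reading of the elimination rule rather than the report's program: a candidate is dropped as soon as the number of inputs already known to exceed it is too large, or the number that could still exceed it is too small, for it to be the i-th largest. The size of the initial sort is taken as a parameter rather than derived, and all function names are hypothetical.

```python
def insert_descending(cand, x):
    """Insert x into a descending list by scanning from the top."""
    k = 0
    while k < len(cand) and cand[k] > x:
        k += 1
    cand.insert(k, x)


def eliminate(cand, larger_gone, unseen, i):
    """Drop candidates that cannot be the i-th largest. `larger_gone` counts
    discarded inputs known to exceed every candidate; `unseen` counts inputs
    not yet inserted. Returns (survivors, larger_gone, dropped positions),
    with positions counted from 1 at the largest."""
    size = len(cand)
    top = min(size, max(0, i - 1 - larger_gone - unseen))   # too few can beat them
    keep_until = min(size, max(top, i - larger_gone))       # too many already do
    dropped = list(range(1, top + 1)) + list(range(keep_until + 1, size + 1))
    return cand[top:keep_until], larger_gone + top, dropped


def find_by_insertion(values, i, first_sort):
    """Return (i-th largest, trace of operations)."""
    n = len(values)
    trace = [f"sort {first_sort}"]
    cand = sorted(values[:first_sort], reverse=True)
    larger_gone, seen = 0, first_sort

    def prune():
        nonlocal cand, larger_gone
        cand, larger_gone, dropped = eliminate(cand, larger_gone, n - seen, i)
        if dropped:
            trace.append("eliminate " + ", ".join(map(str, dropped)))

    prune()
    for x in values[first_sort:]:
        trace.append(f"insert {len(cand)}")    # insert into len(cand) sorted numbers
        insert_descending(cand, x)
        seen += 1
        prune()
    return cand[0], trace


answer, steps = find_by_insertion([5, 1, 9, 3, 7, 2], i=3, first_sort=4)
```

For six inputs, i = 3 and an initial sort of four, `steps` reproduces the sequence sort 4, eliminate 4, insert 3, eliminate 1, 4, insert 2, eliminate 1, 3, and `answer` is 5.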


In summary, we studied the automatic generation of programs for
the solution of a particular class of problems. We succeeded in
introducing enough expertise into the program generating procedure to
produce extremely good and possibly optimal programs for that class of
problems.

References 

Batcher, K., Sorting Networks and Their Applications, AFIPS
Conference Proceedings, Vol. 32, pp. 307-314, Spring 1968.

Levy, S., and Paull, M. C., An Algebra With An Application to
Sorting Algorithms, Proceedings of the Third Annual Princeton
Conference on Information Sciences and Systems (1969).

Knuth, D., The Art of Computer Programming, Vol. 3:
Sorting and Searching, Addison-Wesley, New York (1973).

Van  Voorhis,  D.,  71  Ph.D.  Thesis 

Stanford  University,  Stanford,  California  (1977) 


Published  in  IEEE  Transactions  on  Computers 


ARCHITECTURE  OF  COHERENT  INFORMATION  SYSTEMS 
A  GENERAL  PROBLEM  SOLVING  SYSTEM 


C.V.  Srinivasan 


Department  of  Computer  Science 
Rutgers  University 
New  Brunswick,  New  Jersey 




IEEE TRANSACTIONS ON COMPUTERS, VOL. C-25, NO. 4, APRIL 1976


The  Architecture  of  Coherent  Information  System:  A 
General  Problem  Solving  System 

CHITOOR V. SRINIVASAN, MEMBER, IEEE


Abstract — This  paper  discusses  the  architecture  of  a  metasys- 
tem,  which  can  be  used  to  generate  intelligent  information  sys¬ 
tems  for  different  domains  of  discourse.  It  points  out  the  kinds  of 
knowledge  accepted  by  the  system,  and  the  way  the  knowledge  is 
used  to  do  nontrivial  problem  solving.  The  organization  of  the 
system  makes  it  possible  for  it  to  function  in  the  context  of  a 
large and expanding data base. The metasystem provides a basis
for  the  definition  of  the  concept  of  machine  understanding  in 
terms  of  the  models  that  the  machine  can  build  in  a  domain,  and 
the  way  it  can  use  the  models. 


Index  Terms — General  problem  solving  (GPS),  knowledge 
based  systems,  metadescription  systems  (MDS),  model  based 
reasoning. 


I.  Introduction 

OUR  objective  is  to  create  a  metasystem  which  can 
be  used  to  generate  intelligent  information  sys¬ 
tems  in  different  domains  of  discourse.  The  metasystem 
is  called  the  metadescription  system  (MDS).  It  has  fa¬ 
cilities  to  accept  definitions  of  description  schemas  and 
descriptions themselves, of knowledge (about facts,
objects, processes, and problem solving) in a domain.
A  domain  might  be  a  disease  system,  a  piece  of  mathe¬ 
matics,  or  computing  systems  themselves.  The  descrip¬ 
tion  schemas  and  descriptions  of  knowledge  in  a  domain 
specialize  the  MDS  to  act  as  an  intelligent  information 
system  for  the  domain.  For  a  domain  M,  the  informa¬ 
tion  system  associated  with  it  is  called  the  coherent  in¬ 
formation  system  (CIS)  of  M. 

In  our  research  we  have  two  principal  concerns.  1) 
How  may  one  describe  knowledge  in  a  domain  to  a  com¬ 
puter;  what  kinds  of  knowledge  should  a  system  have  to 
exhibit  intelligent  behaviour;  what  operational  facilities 
are  needed  to  accept  and  use  such  knowledge?  2)  How 
may  the  computer  be  made  to  use  given  knowledge  au¬ 
tomatically  to  solve  problems  in  the  domain  and  answer 
questions? 

The MDS accepts and uses three kinds of knowledge.
1) Structural knowledge pertaining to the form and
syntax of descriptions. Descriptions may, of course, be
strings of words in some language. The MDS will translate
such descriptions to structures within a relational
system. The relational system itself may consist of constants,
variables, predicate symbols, function symbols,
logical operators and quantifiers. The structural
knowledge specifies the structure of the relational system
used in a domain. 2) Sense knowledge: Logical assertions
pertaining to the sense in which structures are
interpreted, and constraints on admissible structures
beyond those specified in the syntax. 3) Transformational
knowledge: This pertains to the knowledge necessary
to transform given descriptions of specific objects
to new ones, according to specified criteria.


Manuscript received February 15, 1973. This work was supported
by the National Institutes of Health under Grant RR 643.

The author is with the Department of Computer Science, Rutgers
University, New Brunswick, N.J. 08903.

Corresponding  to  these  three  levels  of  knowledge 
there  is  a  hierarchy  of  problem  solvers,  checker-instan- 
tiator,  theorem  prover  (TP)  and  designer,  in  order  of  in¬ 
creasing  complexity.  The  checker-instantiator  system 
acts  as  a  sophisticated  data  management  system  that 
establishes,  maintains  and  updates  the  data  base  of 
models  of  specific  objects  in  a  domain  in  a  manner  con¬ 
sistent  with  the  structural  and  sense  knowledge.  Check¬ 
er  can  answer  questions  pertaining  to  any  of  the  specific 
models  for  which  the  information  is  either  directly 
stored  in  the  data  base,  or  is  directly  derivable  by  evalu¬ 
ating  a  given  logical  assertion  in  a  given  context.  The 
TP  adds  power  to  the  checker  in  three  ways.  In  certain 
cases  it  helps  reduce  the  search  effort  of  checker  by  giv¬ 
ing  it  advice  based  on  deduced  consequences  of  sense 
knowledge;  where  feasible  it  can  warn  the  checker  of  im¬ 
possible  situations  in  the  generation  and  updating  of 
models;  it  can  also  determine  general  truth  values  of  as¬ 
sertions  based  on  the  structure  and  sense  knowledge. 
The  designer  adds  further  power  to  the  system  by  en¬ 
abling  the  system  to  plan  courses  of  actions  using  given 
action  primitives  (transformation  rules)  in  a  manner 
consistent  with  the  facts  of  a  problem.  This  hierarchy 
imposes  a  very  useful  classification  of  system  facilities, 
and  gives  the  system  a  considerable  flexibility. 

The  descriptive  language  of  a  domain  is  itself  speci¬ 
fied  in  terms  of  the  model  definitions  in  the  domain. 
Language  analysis  is  thus  looked  at  as  a  model  building 
process.  Most  importantly,  the  model  definitions  in  a 
domain  may  include  definitions  of  problem  solving 
states  (PSS),  relevant  to  the  domain.  The  PSS  may  pro¬ 
vide  facilities  to  summarize  the  problem  solving  experi¬ 
ence  of  the  system.  This  summary  may  be  used  to  intel¬ 
ligently  guide  the  problem  solver. 

This work on MDS and CI systems may be thought of
essentially as a further extension of the trend started by
REF-ARF [1], [2], QA4 [3], fobs [4], STRIPS [5], [6],



SRINIVASAN: COHERENT INFORMATION SYSTEM


and Planner [7]. Its problem solving activity uses
"means-end" analysis, a concept originally introduced
in general problem-solving (GPS) [8], and function invocation
schemes based on goals, introduced by Planner.
CI systems have both the flexibility of Planner-like
systems, and model based reasoning abilities of a GPS-like
system. The entire system depends on the way descriptive
data structures are organized in a given domain.
However, the availability of data-structure and
model definition facilities, and a separate data management
system makes it possible to [illegible] the
data structure and data base details [illegible] problem
solving programs.

[The remainder of this paragraph and the paragraph that follows it are
illegible in the source copy.]

The MDS is now being implemented in Interlisp. [The rest of this
paragraph is illegible in the source copy.]

II. An Overview of the System Architecture

A.  Templates  and  Their  Instantiations 


1) The Templates: The concept of templates, the devices
used to specify structural knowledge, is central to
the entire system architecture. Templates classify
objects in a domain into objects of different kinds and
types. Each template specifies a certain description
structure. Thus, in the M&C problem(1) (see Table I)
PLACE, PEOPLE, VEHICLE, etc. are different kinds of
objects. The template for PLACE, for example, introduces
two relation symbols: occupants and position of.
The pair of relation symbols (occupants, occupants of),
for example, are inverses of each other in the sense that
in instances of PLACE and PEOPLE the relations (PLACE
occupants PEOPLE) and (PEOPLE occupants of PLACE)
will always appear together in the data base of models.
PEOPLE is just a list of PERSONs. An instance of type
classification occurs in the PERSON template. A PERSON
can be a MISSIONARY or a CANNIBAL. In MDS type
classification always reflects distinctions in the way
[illegible]. The templates thus specify [illegible]
of the relational system for a domain: the relation
symbols to be used in the description of various kinds of
objects in the domain, and the kinds of objects that a
relation symbol may relate.

Given such templates one may use the instantiator to
[illegible] instances of the templates. Such instances might
be specified to the [illegible] external language, and
translated to the [several lines illegible in the source copy]
called [illegible], and as many MISSIONARYs and CANNIBALs
as necessary. Each PERSON will be the occupant
of a PLACE and the VEHICLE itself will be at one of
the PLACEs. The templates do not, however, introduce any of
the conditions of the problem. Not all instantiations of
the templates of the M&C problem would represent
[illegible]. The necessary additional criteria are
introduced by the sense knowledge. Every relation
symbol in a template may have a consistency condition
(CC) associated with it. CC1 in Table I is associated
with the symbol "occupants". It says that the CANNIBALs
in a PLACE cannot outnumber the missionaries.

The symbol "*!" in CC1 refers to the current instance
of PLACE at which the CC might be evaluated. It is
called the anchor. (PEOPLE X) stands for "(∀X)(X is
PEOPLE)". All CC's have the form "(*! r P(X))" where
*! is the anchor, r is a relation symbol occurring in the
template associated with *!, and P(X) is some logical
predicate. The predicate P(X) is said to be anchored at
the (template, relation symbol) pair. Thus, the predicate
in [CC1] is anchored at (PLACE, occupants).

In [CC1] notice that "(*! occupants X)" is itself a
term in its predicate. This has the following significance:
For a PLACE like, say, RBANKi, if the system is
told to set (RBANKi occupants y) for some y, it would


(1) There are three missionaries and three cannibals on one bank of a
river. They want to get to the other bank. [The rest of this footnote is
largely illegible in the source copy; the legible fragments concern how many
people the boat can carry at a time and the cannibals outnumbering the
missionaries.]



TABLE I

1. PLACE:   (occupants PEOPLE occupants of), CC1
            (position of VEHIL position), CC2

2. PEOPLE:  (elements PERSON elements of)

3. VEHIL:   (elements VEHICLE elements of)

4. PERSON:  (type PTYP type of)
            (occupant of PLACEL1 occupant), CC3

5. PTYP:    MISSIONARY, CANNIBAL

6. PLACEL1: (elements (PLACE, VEHICLE) elements of)

7. PLACEL:  (elements PLACE elements of)

8. VEHICLE: (pilots PEOPLE pilots of)
            (position PLACE position of)
            (cango to PLACEL destination of)
            (capacity INTEGER capacity of)
            (occupants PEOPLE occupants of), CC4

[CC1] (*! occupants ((PEOPLE X)|(*! occupants X)
          (((NUMBEROF MISSIONARY X) ≥
            (NUMBEROF CANNIBAL X)) ∨
           ((NUMBEROF MISSIONARY X) is 0))))

[CC2] (*! position of ((VEHICLE X)|(*! position of X)
          (X cango *!)))

[CC3] (*! occupants of .*.is 1)

[CC4] (*! occupants.*. s. capacity of *!)


first  construct  the  combined  list  of  existing  occupants 
of  RBANKi  and  y,  and  then  verify  the  predicate.  CC’s  of 
this  kind  are  called  declarative  CC’s,  as  opposed  to  the 
other  kind,  called  imperative  CC’s,  like,  say  (for  a  hypo¬ 
thetical  template  PERSONi) 

[CS1] (*! sibling ((PERSONi X)|(NOT (X is *!))

(X child of father of *!)))

[CS1] may be used to find the siblings of a PERSONi in
terms  of  the  child  of  and  father  of  relation  symbols. 
The  checker  is  used  to  evaluate  CC’s.  We  shall  discuss 
the  evaluator  in  Section  II-B. 
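The difference between a declarative and an imperative CC can be illustrated with a short Python sketch. It is only an analogy under assumptions of ours: the classes `Person` and `Place`, the functions `cc1` and `siblings`, and the use of an exception to reject an update are all hypothetical and are not part of the MDS. The declarative condition is checked against the combined list of existing and new occupants before the update is accepted; the imperative one is used to compute the value of a relation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Person:
    name: str
    ptype: str = "MISSIONARY"            # "MISSIONARY" or "CANNIBAL"
    father: Optional["Person"] = None


def cc1(occupants):
    """Declarative reading of CC1: missionaries are not outnumbered, or absent."""
    missionaries = sum(p.ptype == "MISSIONARY" for p in occupants)
    cannibals = sum(p.ptype == "CANNIBAL" for p in occupants)
    return missionaries >= cannibals or missionaries == 0


@dataclass
class Place:
    name: str
    occupants: List[Person] = field(default_factory=list)

    def add_occupants(self, newcomers):
        """Build the combined occupant list first, then verify the predicate."""
        combined = self.occupants + list(newcomers)
        if not cc1(combined):
            raise ValueError(f"CC1 violated at {self.name}")
        self.occupants = combined


def siblings(person, everyone):
    """Imperative reading of CS1: every X that is not the person and is a
    child of the person's father."""
    if person.father is None:
        return []
    return [x for x in everyone
            if x is not person and x.father is person.father]
```

Adding one missionary and then one cannibal to an empty place succeeds, while adding a second cannibal raises an error because the combined list would violate the condition.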

The  significant  points  to  be  noted  about  CC’s  are  the 
following. 

1)  The  knowledge  represented  by  the  CC’s  is  of  a  dif¬ 
ferent  kind  from  the  structural  knowledge,  specified  by 
the  templates. 

2)  Each  CC  is  specifically  associated  with  a  particu¬ 
lar  relation  symbol.  A  relation  symbol,  say  “ likes ”, 
might  be  quite  different  in  the  context  (HUMAN  likes 
SOMETHING),  from  (CATTLE  likes  SOMETHING).  A 
CC  is  invoked  and  interpreted  only  within  the  particu¬ 
lar  local  context  of  its  anchor,  within  the  overall  struc¬ 
ture  of  descriptions. 

3)  The  logic  of  the  CC’s  is  highly  dependent  on  the 
structures  specified  by  templates.  Also,  for  a  given  sys¬ 
tem  of  templates  there  may  be  more  than  one  way  of 
choosing  and  anchoring  the  CC’s.  Further,  for  a  given 
domain, there will undoubtedly be several ways of defining
the templates and their associated CC's. These different
definitions will correspond to different ways of
representing  the  knowledge  in  the  domain.  The  MDS 
provides  facilities  to  experiment  with  different  choices. 


At  present  we  have  no  formal  guidelines  to  make  these 
choices intelligently. The particular choices made in a
domain  will  have  an  effect  on  system  efficiency. 

2) Instantiation of Templates: We shall call an instance
of a template the model of the object instantiated.
Thus, the model of RBANKi will be an instance
of PLACE. Every triplet (x r y) (where r is a relation
symbol) appearing in the model x should be dimensionally
consistent. That is, for some templates M and T,
where x is an instance of M and y is an instance of T, either
(M r T) occurs in M, or (T r̄ M) occurs in T, where
r̄ is the inverse of r. There are a few relation symbols
which  are  system  wide,  like  template  of,  name  of,  ele¬ 
ments  of,  arguments  of,  etc.,  which  can  appear  with  all 
instances  in  the  data  base,  and  need  not  be  defined  in 
the  templates. 
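The dimensional consistency test can be sketched in Python as follows. The dictionary `TEMPLATES` holds a few of the clearly legible entries of Table I, each written as (relation, target template, inverse relation); the representation and the function name `dimensionally_consistent` are assumptions made for illustration, and system-wide relations such as template of are left out.

```python
# template name -> list of (relation, target template, inverse relation)
TEMPLATES = {
    "PLACE": [("occupants", "PEOPLE", "occupants of")],
    "PEOPLE": [("elements", "PERSON", "elements of")],
    "VEHICLE": [("capacity", "INTEGER", "capacity of"),
                ("occupants", "PEOPLE", "occupants of")],
}


def dimensionally_consistent(m, r, t):
    """True if (M r T) occurs in template M, or if template T contains an
    entry relating T to M whose inverse relation is r."""
    forward = any(rel == r and target == t
                  for rel, target, _ in TEMPLATES.get(m, []))
    backward = any(inv == r and target == m
                   for _, target, inv in TEMPLATES.get(t, []))
    return forward or backward


ok_forward = dimensionally_consistent("PLACE", "occupants", "PEOPLE")
ok_inverse = dimensionally_consistent("PEOPLE", "occupants of", "PLACE")
not_ok = dimensionally_consistent("PLACE", "capacity", "INTEGER")
```

The first triplet is accepted because it occurs in the PLACE template, the second because its inverse does, and the third is rejected because no template relates PLACE and INTEGER by capacity.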

The model of RBANK1 will be a vector of five pointers,
say (Pto, Pn, Peo, Po, Ppo), corresponding to the relations
template of, name, elements of, occupants and
position of, respectively. Pto will point to a pair (Pto1,
Pto2), where Pto1 points to the PLACE template, and
Pto2 to possibly local conditions (LC’s) associated with
RBANK1. Pn will, of course, point to “RBANK1”. Let r
be any one of the remaining relations: Pr will point to a
quintuple of the form (#, Pr1, Pr2, Pr3, Pr4), called the
descriptor unit of Pr (or r). The elements of the descriptor
unit are the following.

Descriptor Unit

Pr4  Pointer to y such that (RBANK1 r y) is true,
     or pointer to list (y) such that (RBANK1 r z) is
     true for every z ∈ y. We shall write this as
     (RBANK1 r (y)).

Pr3  Pointer to list (y) such that for every z ∈ y,
     (NOT (RBANK1 r z)) is true.

Pr2  To local conditions on values of (RBANK1 r).

Pr1  To transformation rules (TR’s) local to
     (RBANK1 r), called LTR’s.

#    The number of elements in the list, set or triplet
     pointed to by Pr4.

Every Pr4 will have an inverse, say Pr̄4, which will point
back to RBANK1; the inverse of Pr̄4 is the same as Pr4. The inverse of
(Pto1, Pto2) will be (P̄to1, P̄to2), where P̄to1 is the same as
Pi (for instance); Pi will point to RBANK1 from the PLACE
template.

A  pointer  in  a  model  can  have  one  of  four  values:  not 
stored  (NS),  not  enough  information  (NEI),  NIL,  or  an 
address  (or  value).  Initially  all  pointers  in  a  model  are 
set  to  NEI.  A  list,  set  or  tuple  will  have  NEI  as  an  ele¬ 
ment  if  it  is  incomplete.  Templates  thus  specify  the 
data  structures  of  models  in  a  domain.  They  provide  the 
basic  framework  for  the  organization  of  domain  depen¬ 
dent  knowledge.  They  also  play  a  major  role  in  the  spec¬ 
ification  and  use  of  problem  solving  programs  in  a  do¬ 
main,  as  we  shall  see  in  Sections  II-B  and  II-C. 
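
The layout of a model described above can be sketched in Python. This is only an illustration under stated assumptions: the class and field names (`Marker`, `DescriptorUnit`, `make_model`) are hypothetical and do not come from the MDS, and addresses are replaced by ordinary Python values.

```python
from dataclasses import dataclass
from enum import Enum


class Marker(Enum):
    """Non-address pointer values named in the text."""
    NS = "not stored"
    NEI = "not enough information"
    NIL = "nil"


@dataclass
class DescriptorUnit:
    """The quintuple (#, Pr1, Pr2, Pr3, Pr4); field names are hypothetical."""
    ltrs: object = Marker.NEI              # Pr1: local transformation rules
    local_conditions: object = Marker.NEI  # Pr2: local conditions on values
    excluded: object = Marker.NEI          # Pr3: list of z with (NOT (x r z))
    values: object = Marker.NEI            # Pr4: y, or list (y), with (x r y)

    @property
    def count(self):
        """The '#' entry: how many elements Pr4 points to."""
        if isinstance(self.values, Marker):
            return 0
        if isinstance(self.values, (list, tuple, set)):
            return len(self.values)
        return 1


def make_model(name, template, relations):
    """Build a model whose pointers all start out as NEI."""
    model = {"template of": (template, Marker.NEI), "name": name}
    for r in relations:
        model[r] = DescriptorUnit()
    return model


rbank1 = make_model("RBANK1", "PLACE",
                    ["elements of", "occupants", "position of"])
rbank1["occupants"].values = ["M1", "M2", "M3", "C1", "C2", "C3"]
```

The resulting dictionary has five entries, matching the vector of five pointers for RBANK1, and every descriptor field is NEI until something is stored.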

There are about fifteen different kinds of templates
in the MDS. Variations in the structure of descriptions
may be specified by defining what are called variable
templates. Exceptions to the CC’s may be specified by
associating local conditions (LC’s) with specific instances
of templates. An LC may be a conjunctive LC
(CLC) or a disjunctive LC (DLC). A model should satisfy
((CC ∧ CLC) ∨ DLC) at each one of its relation
symbols. Similarly, transformation rules (TR’s) for
changing a model may be local to a model (LTR) or may
be associated with templates themselves.

In addition to the CC’s associated with pairs (M, r),
where M is a template and r is a relation symbol, r may
also have properties (PR) defined for it, which apply to
all occurrences of r within the relational system. A typical
such property is the transitivity property. For a model
m, the PR’s are used to identify objects y, such that (m
r y) is true, but is not stored in the data base.
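
As an illustration of how a property such as transitivity can yield facts that are true but not stored, the following Python sketch closes a relation over stored triples. The function name `values`, the triple representation, and the relation `located in` are assumptions made for the example, not part of the MDS.

```python
def values(facts, properties, m, r):
    """Return every y with (m r y), stored or implied by transitivity of r."""
    stored = {y for (x, rel, y) in facts if x == m and rel == r}
    if "transitive" not in properties.get(r, set()):
        return stored
    found, frontier = set(stored), list(stored)
    while frontier:
        node = frontier.pop()
        for (x, rel, y) in facts:
            if x == node and rel == r and y not in found:
                found.add(y)          # implied fact, not stored explicitly
                frontier.append(y)
    return found


facts = {
    ("BOAT", "located in", "RBANK1"),
    ("RBANK1", "located in", "RIVER"),
}
properties = {"located in": {"transitive"}}
implied = values(facts, properties, "BOAT", "located in")
```

Here `(BOAT located in RIVER)` is returned although only the two shorter facts are stored.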

All problem solving programs communicate with the
data base via the instantiator and checker. Templates
also provide a way of classifying and storing the CC’s,
LC’s, TR’s, and LTR’s. Every CC (LC) is anchored at
(m, r) where m is a template (or model) and r is a relation
symbol (relation) defined on m. The DON list of a
CC (LC, TR, or LTR) is the list of (mi, ri) on which it
depends. The DET list of a pair (m, r) is the list (mi, ri) of
pairs which depend on (m, r). So also, the DET list of a
TR (LTR) is the list of (mi, ri) which are affected by the
TR (LTR). The DON and DET lists are stored with each
CC, LC, TR, and LTR. Also, every pair (m, r) will have a
pointer to its associated CC, LC, TR, and LTR.
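
The relation between the two kinds of lists can be sketched as follows, assuming (as a simplification) that each condition is identified by its anchor pair and that a pair depends on every entry in the DON list of the condition anchored at it. The function name `build_det` and the sample pairs are hypothetical.

```python
from collections import defaultdict


def build_det(don):
    """Invert DON lists: for each pair, list the anchors that depend on it."""
    det = defaultdict(list)
    for anchor, depends_on in don.items():
        for pair in depends_on:
            det[pair].append(anchor)
    return dict(det)


# DON lists keyed by the anchor (m, r) of a condition (illustrative data).
don = {
    ("VEHICLE", "occupants"): [("VEHICLE", "capacity"), ("PLACE", "occupants")],
    ("PLACE", "occupants"): [("VEHICLE", "position")],
}
det = build_det(don)
```

A change at `(PLACE, occupants)` would then point the checker at the condition anchored at `(VEHICLE, occupants)`.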

B.  Evaluation  of  Consistency  Conditions 

1) The Logic of Checker: The evaluation of CC’s,
LC’s and PR’s will involve searching of the models in
the data base. The conditions themselves specify the search
paths. The anchor “*!” may be used to optimize this
search. One can write small efficient programs to evaluate
these conditions. (Alternatively one may compile
each CC and PR individually.) The checker is the interpreter
for CC’s, LC’s, and PR’s. It uses the G function
(G for Get) of the instantiator to retrieve objects from
the data base. G does the following:

(Q1) G(x r ?) = {y | (x r y)} (this may include NEI),
or NIL or NEI.

(Q2) G(x r y) = yes, NIL or NEI.

(Q3) G(x ? y) = p, or NIL,

where p is a relation path (r1, r2, ..., rk) joining x and y
(the shortest one). G just looks up the data base using
the templates. If the answer is NS, NEI, or if NEI is included
in the answer, then G will invoke the checker to
evaluate the associated CC’s, LC’s, and PR’s. This evaluation
may cause the NEI to be removed and possibly
add new elements to (y).
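
The three query forms of G can be sketched over a set of stored triples. This is a simplified illustration, not the instantiator itself: the triple store, the use of the string `"?"` for the open slot, and the function name `g` are assumptions, and the NS/NEI cases with their fallback to the checker are left out.

```python
from collections import deque

NIL = "NIL"


def g(facts, x, r, y):
    """Answer (x r ?), (x r y) or (x ? y) against a set of (x, r, y) triples."""
    if y == "?":                       # Q1: all y with (x r y)
        found = [b for (a, rel, b) in sorted(facts) if a == x and rel == r]
        return found if found else NIL
    if r == "?":                       # Q3: shortest relation path from x to y
        queue, seen = deque([(x, [])]), {x}
        while queue:
            node, path = queue.popleft()
            for (a, rel, b) in sorted(facts):
                if a == node and b not in seen:
                    if b == y:
                        return path + [rel]
                    seen.add(b)
                    queue.append((b, path + [rel]))
        return NIL
    return "yes" if (x, r, y) in facts else NIL   # Q2: membership test


facts = {
    ("RBANK1", "occupants", "M1"),
    ("RBANK1", "occupants", "C1"),
    ("M1", "pilot of", "BOAT"),
}
```

Breadth-first search is used for Q3 because it finds a path with the fewest relation steps, matching "the shortest one" in the text.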

The checker operates on a three valued logic system
having truth values T (TRUE), F (FALSE), and ? (NEI).
The logic of checker is shown in Table II. For a given
predicate P, besides returning its logical value (one of
T, ?, F), the checker also returns seven other quantities,
all of which will be subexpressions of P. These are explained
below.

In all the following functions assume that $\phi$ is a particular
valuation of the variables in P. $\phi(P)$ = T, F, or ?.

$TR_\phi(P)$   True residue of P. The subexpression of
               P that caused $\phi(P)$ = T.

$FR_\phi(P)$   False residue of P. The subexpression of
               P that caused $\phi(P)$ = F.

$R_\phi(P)$    Residue of P. The subexpression of P
               that caused $\phi(P)$ = ?.

$TP_\phi(P)$   True part of P. The subexpression of P
               that evaluated to T. $\phi(P)$ itself may be T,
               F, or ?.

$FP_\phi(P)$   False part of P. The subexpression of P
               that evaluated to F. $\phi(P)$ itself may be T,
               F, or ?.

$NTP_\phi(P)$  Not true part of P. The subexpression of
               P that evaluated to F or ?. $\phi(P)$ itself may
               be T, F, or ?.

$NFP_\phi(P)$  Not false part of P. The subexpression of
               P that evaluated to T or ?.

The various residues and parts of P are used in the
planning phase of designer to summarize past experiences
of the system. In the TP the residues are used to
guide the problem solving search. The properties of
these functions that make them useful are given below.
The logic of these functions is shown in Table II.
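
A small Python sketch may help fix ideas about the three valued evaluation and the parts of a predicate. It assumes the standard strong Kleene connectives for T, F and ?, and it computes the parts only for a flat conjunction of atoms; the actual entries of Table II are not reproduced here, and the names `and3`, `or3`, `not3`, `evaluate` and `parts` are hypothetical.

```python
T, F, U = "T", "F", "?"      # U stands for ? (NEI)


def not3(a):
    return {T: F, F: T, U: U}[a]


def and3(a, b):
    if a == F or b == F:
        return F
    if a == T and b == T:
        return T
    return U


def or3(a, b):
    return not3(and3(not3(a), not3(b)))


def evaluate(expr, valuation):
    """expr is an atom name, ('not', e), ('and', e1, e2) or ('or', e1, e2)."""
    if isinstance(expr, str):
        return valuation.get(expr, U)     # unknown atoms evaluate to ?
    if expr[0] == "not":
        return not3(evaluate(expr[1], valuation))
    left = evaluate(expr[1], valuation)
    right = evaluate(expr[2], valuation)
    return and3(left, right) if expr[0] == "and" else or3(left, right)


def parts(conjuncts, valuation):
    """TP, FP, NTP and NFP of a flat conjunction, as lists of conjuncts."""
    vals = {c: valuation.get(c, U) for c in conjuncts}
    return {
        "TP": [c for c in conjuncts if vals[c] == T],
        "FP": [c for c in conjuncts if vals[c] == F],
        "NTP": [c for c in conjuncts if vals[c] != T],
        "NFP": [c for c in conjuncts if vals[c] != F],
    }


# The test ((CC and CLC) or DLC) applied at a relation symbol of a model.
verdict = evaluate(("or", ("and", "cc", "clc"), "dlc"),
                   {"cc": T, "clc": F, "dlc": T})
```

For a conjunction whose value is ?, the not true part collects exactly the conjuncts that still need to be made true, which is how it can be issued as subgoals.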

Let $\phi$ and $\psi$ denote valuations of the variables in P.
Assume in what follows, that a certain task k was done
either by the TP or designer, and the outcome of the
task k depended on P. Also, at the time k was attempted,
$\psi$ was the valuation of the variables in P. At a later
time, a new valuation $\phi$ of the variables in P was obtained,
and the system had to decide whether to attempt k for
the new valuation.

1) $(\forall P)(\forall \psi)(\forall \phi)\,[\phi(TR_\psi(P)) \rightarrow \phi(P)]$

If $\phi(TR_\psi(P))$ is true, then try k again.

2) $(\forall P)(\forall \psi)(\forall \phi)\,[\neg\phi(FR_\psi(P)) \rightarrow \neg\phi(P)]$

Do not try k if $\phi(FR_\psi(P))$ = F.

3) $(\forall P)(\forall \psi)(\exists \phi)\,[\phi(NTP_\psi(P)) \wedge \phi(TP_\psi(P)) \rightarrow \phi(P)]$

A goal k could not be reached for valuation $\psi$ because
$\psi(P)$ = F. Then try to find a $\phi$ for which
$\phi(NTP_\psi(P)) \wedge \phi(TP_\psi(P))$ is true. This is used to break up a goal
into subgoals, in means-end analysis.

4) $(\forall P)(\forall \psi)(\exists \phi)\,[\phi(R_\psi(P)) \wedge \phi(TP_\psi(P)) \rightarrow \phi(P)]$

Here goal k could not be reached because $\psi(P)$ = ?,
and the unknown parts of P are given by $R_\psi(P)$. Then
find a $\phi$ (build new objects) for which $\phi(R_\psi(P))$ and
$\phi(TP_\psi(P))$ are true. This is used in means-end analysis,
and in the TP.

TABLE II
Logic of Checker

Propositions: P, Q. Let X denote one of TR, FR, R, TP, FP, NTP, NFP.
∧ and ∨ are symmetric:
$X_\phi(P \wedge Q) = X_\phi(Q \wedge P)$;  $X_\phi(P \vee Q) = X_\phi(Q \vee P)$.
The various functions are defined for (P ∧ Q).

2) What Checker and Instantiator Can Do; What
More is Needed: The checker and instantiator together
act as a fairly sophisticated data base management system.
The checker makes sure that data entered into the
data base are consistent and also keeps track of what
additional data are needed to complete the descriptions
of objects with respect to the templates. The templates
for a domain describe the structure of the data base for
the domain. The checker uses this structure to guide the
instantiator to create and retrieve items in the data base
selectively.

The limitations of the checker arise in the automatic
guidance it can provide in the updating process. The
checker has facilities to interpret individual CC’s and to
recognize the relation symbols whose value in the data
base might be affected as a result of a change made at
one place in the data base. Checker keeps track of the
relation symbols by cataloging them in
terms of their appearances in the various CC’s. In general,
a change in the value of one relation symbol might
propagate through the data base to a series of other
relation symbol values. As long as any given instance of
the value of a relation symbol does not repeat itself in
this series, checker will have no problems. It can execute
the series of necessary changes without ever having to
go back to a value that it had previously changed within
the sequence.

Checker simply performs search in the data base, and
logical combinations of search. It has only simple facilities
to keep track of alternate choices in search paths,
and choices in possible valuations of relation symbols.
Also, checker can handle only constants as possible valuations
for relation symbols. When the number of alternatives
is large or when loops occur in an updating
chain, the checker, if left to run, will keep assigning new
values to the relation symbols involved until a consistent
set of valuations is obtained, or until all known
possibilities are exhausted. The only choices it can generate
are those that are already available in the data
base, or those that may be obtained by evaluating specific
consistency conditions in specific local contexts. It
does not have the capability to deduce logical consequences
and make use of them to find contradictions
where possible. To do this, general theorem proving capability
is necessary. The essential difference between
the checker and a TP is the following. Whereas the
checker can assign as values to relation symbols only
specific constants in the data base, the TP can assign as
values, variables with specified logical properties. The
TP can carry with it the logical properties assigned to
variables and use them in making new assignments as it
goes along. Resolution based theorem proving systems
have this capability built into the unification algorithm
(see Nilsson, 1971).




In MDS the checker will invoke the TP whenever it
does not find enough information in the data base to
evaluate a CC at a particular anchor, or whenever the
validity of an assertion is to be proven universally, not
merely with respect to the facts known about the specific
objects in the data base. The checker will also call the TP
when it recognizes a loop in an updating chain.

The deduction process and the control structure of
the TP in MDS are different from those of a resolution
based system (see [10]).


B.  The  Dynamic  Aspects  of  Modeling:  The 
Transformation  Rules  and  Their  Interpretation 

1) The Primitives: There are about 20 primitives that
enable one to do programming in a backtracking environment.
The primitives are classified as shown in Fig.
1(a). The environmental control primitives
(ECP’s) in Fig. 1(a) are used to establish a control environment
(cenviron) within a scope. The execution of
functions within the scope is affected by it. See Table
III for a description of the ECP’s. The sequential control
primitives (SCP’s), like GO, COND, etc., are also listed
in Table III. There are
seven active primitives, GOAL, ASSERT, DELETE,
CANDO, IFDON, TRY, and BIND. The execution sequences
for the GOAL and other active commands are
shown in Fig. 1(b) and (c). GOAL invokes appropriate
definitions from the data base, and does “means-end” analysis
when necessary. ASSERT and DELETE issue I and D
commands to the instantiator, when successful. All
primitives, other than the control primitives, may have
CANDO, IFDON, and TRY functions associated with
them. A primitive can be executed only if its associated
CANDO’s are satisfied. If a primitive fails then one may
try its associated TRY functions. If a primitive is successful
then its associated IFDON’s should be executed.
Only if the IFDON’s are also successfully completed may
the primitive return success to its parent. Let us follow
the operation with an example.
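
The interplay of CANDO, TRY and IFDON functions around a primitive can be sketched as a small driver. This is a hedged illustration only: the function `run_primitive` is hypothetical, each associated function is modeled as a Python callable returning True on success, and the retry of the primitive after a successful TRY is an assumption, since the text does not spell out that step.

```python
def run_primitive(action, candos=(), ifdons=(), tries=()):
    """Run one primitive with its associated CANDO, TRY and IFDON functions."""
    # A primitive can be executed only if all its CANDO's are satisfied.
    if not all(cando() for cando in candos):
        return False
    succeeded = action()
    if not succeeded:
        # On failure, attempt the TRY functions in turn.
        for attempt in tries:
            # Assumption: after a successful TRY the primitive is retried.
            if attempt() and action():
                succeeded = True
                break
    if not succeeded:
        return False
    # Success is reported to the parent only if every IFDON also succeeds.
    return all(ifdon() for ifdon in ifdons)


# Illustrative use: loading a vehicle that must first be brought over.
state = {"vehicle_here": False, "loaded": False}


def load():
    if state["vehicle_here"]:
        state["loaded"] = True
    return state["loaded"]


def bring_vehicle():
    state["vehicle_here"] = True
    return True


result = run_primitive(load, candos=[lambda: True],
                       tries=[bring_vehicle],
                       ifdons=[lambda: state["loaded"]])
```

In this run the first attempt at `load` fails, the TRY function brings the vehicle over, and the retried `load` then succeeds.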

2) Interpretation of the Active Primitives: The syntax
of the various active primitives is shown in Table
IV. The designer is the interpreter for the primitives.
Consider, for example, the <dimension> (see syntax of
<dimension> in Table IV) of the GOAL function TR1 in
Table V.

    ((PEOPLE X) (PLACE P Q) (P occupants X)      <bindings>
     (GOAL (Q occupants X)))                      <fn-clause>

The function call that will cause this TR1 to be invoked
is

    ((PEOPLE X)                                   <bindings>
     (GOAL (RBANK2 occupants X)))                 <fn-clause>

Let us follow the interpretation of this function call, as
specified in Fig. 1(b).

a) Find Possible Bindings: The checker is used to
bind variables in a <dimension> statement. We shall assume
that the <proposition> in the GOAL clause is always
in disjunctive normal form. In the above case X
will be bound to (M1 M2 M3 C1 C2 C3). If the checker
returns NEI, or a loop is encountered, then the TP may
be invoked to complete the bindings. Unless the IDB
clause (see Table III for an explanation of the IDB
clause) is present the TP will create new objects, if necessary,
to complete the bindings.

b) Find Initial Conditions: This is done by checking
whether the GOAL is already satisfied in the data base
for specific bindings of variables. In the case of our example,
this will bring out the fact (RBANK1 occupants
(M1 M2 M3 C1 C2 C3)). This will cause the following
invocation pattern to be built:

    ((PEOPLE (X1 ← (M1 M2 M3 C1 C2 C3)))
     (PLACE (X2 ← RBANK1) (X2 occupants X1)
            (X3 ← RBANK2))                        <bindings>
     (GOAL (X3 occupants X1)))                    <fn clause>

Let b be the <bindings> and g the <proposition> of
the <fn clause>. The canonical form of an invocation
pattern (also a <dimension> after binding the variables)
is

    (<bound quantifiers> ((b1 g1) ∨ ... ∨ (bm gm)))

where each bi and gi is a conjunction of terms, possibly
with OPNL, IFND, IDB or * clauses. Let D be an invocation
pattern and Di any <dimension> in the data base.
Di and D are said to match, (Di → D), if there exist
bindings for the variables in Di such that for some bij in
Di and bk in D, (bij → bk) and (gij → gk), and in addition
bij is true in the data base. The bij will be the initial
conditions. The invocation process will retrieve all
Di that match D.

If no such functions, Di, are available then the designer
will force the GOAL by issuing the appropriate
ASSERT and DELETE commands. These will cause their
associated CANDO’s to be executed. If the CANDO’s succeed
then the corresponding I and D commands will be
tried. This will cause the associated CC’s to be evaluated
at the given bindings of the variables, say $\psi$. If the
CC’s are not satisfied then the not true part of $\psi$
($NTP_\psi$) and true part ($TP_\psi$) will be issued as subgoals.
If the CANDO’s are not satisfied then the goal will be
abandoned.

In general, both the binding and invocation processes
will return more than one possible course of action.
In both these cases the problem solver needs to be
guided intelligently in making its choices. The DESIGNER
has some built-in facilities for intelligent selection
of choices from a set of alternatives. The problem
solving state (PSS) provides this guidance. This is
discussed in the next section.

Fig. 1. Classification of primitives and their execution sequences.
(a) Classification of primitives: control primitives (environmental
control primitives and sequential control primitives) and active
primitives. (b) GOAL function call: create a PSS instance, pss, to
record the call and enter all the appropriate information in pss,
linking it to other PSS instances; find all possible variable bindings;
enter bindings in pss. (c) Action function call (not a GOAL function):
create a PSS instance, pss, to record the call and enter all appropriate
information in pss, linking it to other PSS instances; find all possible
variable bindings and enter them in pss; if no bindings are left, return
failure (update pss); otherwise choose a binding using pss and enter the
choice in pss; if the function is ASSERT or DELETE, check the data base
and return success if the statement is already true; execute all
CANDO’s, if any; execute the function; on success execute all IFDON’s,
if any; on failure execute any remaining TRY’s; return success
(update pss).

3) The PSS: The PSS itself is defined by templates.
The PSS template is shown in Table VI. This table is
self-explanatory. Every time the designer invokes a
function or executes a <fn-call> it will create an instance
of PSS corresponding to the function. The network of
all such PSS instances is the problem solving protocol.
The CC’s associated with the PSS template provide the
necessary guidance to DESIGNER. Of particular interest
are the CC’s associated with the bindings and alternates
(see Table VI) relations. Let us call these [CCB]
and [CCE], respectively. These CC’s will specify the
choices of current bindings and current function. Two
important notions that make this possible are the notions
of similarity of two PSS instances, and cc summary
of a PSS instance.

cc summary (CCS): A CCS is a record of evaluations
of CC’s, branching conditions, CANDO conditions
and binding conditions, made during the tenure of a
PSS instance. For each sequence of conditions evaluated,
the CC summary will contain the TRUE RESIDUE’s
of the conditions evaluated if the condition evaluated to
TRUE; the FALSE RESIDUE, NOT TRUE PART and TRUE
PART if the condition evaluated to FALSE; and the RESIDUE
if the condition evaluated to NEI. It will also have the
outcome (fn state) of the PSS instance in which the condition
was evaluated, and specific variable bindings, if
any, in terms of the kinds and types of objects used. All
variable bindings in the CC summary of a PSS will be
specified in terms of the variables that appear in the
bindings of the PSS. The concept will become clear in
the example considered below. The consistency condition
[CCB] uses CC summaries.

The general rule is: pick for bindings the same kind
and type of objects that previously succeeded in similar
PSS instances; do not pick the kind and type of objects
that previously failed. Use cc summaries to check
whether a chosen binding is likely to succeed. If no
bindings could be picked by the above rules, then pick
arbitrarily.
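
The general rule can be sketched as a small selection function. This is an illustration under stated assumptions: a cc summary is reduced to a pair (kinds of objects bound, outcome), the outcome strings "Success" and "Fail" and the names `choose_binding` and `kind_of` are hypothetical, and "arbitrarily" is taken to mean the first candidate.

```python
def choose_binding(candidates, kind_of, summaries):
    """Pick a binding guided by outcomes recorded in similar PSS instances."""
    def kinds(binding):
        return tuple(sorted(kind_of[obj] for obj in binding))

    succeeded = {tuple(sorted(k)) for k, outcome in summaries
                 if outcome == "Success"}
    failed = {tuple(sorted(k)) for k, outcome in summaries
              if outcome == "Fail"}
    # Prefer the kind and type of objects that previously succeeded.
    for binding in candidates:
        if kinds(binding) in succeeded:
            return binding
    # Otherwise avoid the kind and type of objects that previously failed.
    for binding in candidates:
        if kinds(binding) not in failed:
            return binding
    # No binding could be picked by the rules: pick arbitrarily.
    return candidates[0] if candidates else None


kind_of = {"M1": "M", "M2": "M", "C1": "C", "C2": "C"}
candidates = [("M1", "M2"), ("M1", "C1"), ("C1", "C2")]
after_failure = choose_binding(candidates, kind_of, [(("M", "M"), "Fail")])
```

After a recorded failure with two objects of kind M, the function skips `(M1, M2)` and proposes a mixed pair instead.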

To define the notion of similarity of two PSS states


TABLE III
The Control Primitives

ECP: Environmental Control Primitives

1. SUPPRESS  Suppress execution of specified class of functions
             within a scope. Ex: ((SUP CANDO) <body of program>).
2. CONTEXT   Sets the context in which the problem solving is to
             take place. It can be a domain, a class of problems,
             or the versions of the models in the data base, that
             are to be used. Ex: (CONTEXT version23), or
             (CONTEXT 1973).
3. UP        Fixes the direction of search in searches over subsets
             of a set. The search is to start with the smallest
             subset and proceed upwards. Similarly, there is also a
             DOWN. Ex: ((UP (SOME PEOPLE X)) (X occupants of RBANK1)).
4. DSJN      A disjunction of goals is to hold. Ex: (DSJN
             <goal stmt> ... <goal stmt>).
5. CNJN      A conjunction of goals is to hold. Ex: (CNJN
             <goal stmt> ... <goal stmt>).
6. RELEASE   Release a previously suppressed class of functions.
             Ex: ((RLSE CANDO) <body of program>).
7. IDB       Use only the models available in the data base.
             Do not create any new objects to satisfy a binding
             condition. Ex: (IDB <body of program>).
8. *         DESIGNER cannot leave a *-ed item in a changed
             state at the end of a problem solving session.
             Ex: ((*(VEHICLE V)) (V location RBANK2) ...).
9. NBT       No back tracking anywhere, throughout the scope.
             Ex: (NBT <body of program>).

SCP: Sequential Control Primitives

1. GO        Go to a label.
2. BKTRK     Back track to a label.
3. COND      Like LISP COND. Back tracking is allowed under
             failure.
4. KILL      Kill a function and fake it to show SUCCESS or
             FAILURE, as specified in the argument.
5. SUSPEND   Suspend execution and save current state.
6. ACTIVATE  Reactivate a previously suspended function.
7. OPTIONAL  Failure of an OPTIONAL clause or action will not
             normally cause backtracking.
8. IFNEEDED  Back tracking can occur only if the IFNEEDED
             clause or action also fails.
9. REPEAT    Repeat until the condition of the repeat is
             satisfied.

An Example: (OPNL (REPEAT <termination condition> (SUP TRY) (...)
(...) (ASSERT ...)))

The REPEAT clause is in the scope of OPNL. Hence no back tracking will
occur on failure. Within the scope of REPEAT all executions of TRY
functions are to be suppressed. One may also thus selectively suppress
the CANDO and IFDON functions, or any of the ECP’s themselves, within
a scope.


TABLE IV
The Syntax of Transformation Rules

(A) GOAL RULES:

<gfndefn>       →  (<gdimension> <body>)
<gdimension>    →  (<bindings> <gfn-clause>)
<gfn-clause>    →  (GOAL <gproposition>)
<bindings>      →  a conjunction of relations of the form
                   (x r y) or (x r̄ y), or (r x1 x2 ... xn) |
                   <predicate> |
                   (OPTIONAL <predicate>) |
                   (IFNEEDED <predicate>) |
                   (IDB <predicate>) |
                   <bindings> <bindings>
<gproposition>  →  a propositional expression of relations,
                   which may include OPTIONAL, IFNEEDED, etc.
                   clauses, and UP, DOWN modifiers of quantifiers,
                   of the forms (UP (SOME x)), (DOWN (ALL x)), etc.

(B) ACTION RULES:

<action rule>   →  (<bindings> (<action> <gproposition>))
<action>        →  ASSERT | DELETE | FORCE

(C) IFCAN RULES:

<ifcan>           →  CANDO | IFDON
<ifcan rule>      →  (<ifcan> <dimension> <ifcan body>)
<dimension>       →  (<bindings> <fn clause>)
<fn clause>       →  (<fn> <gproposition>)
<fn>              →  GOAL | ASSERT | DELETE | FORCE
<ifcan body>      →  <ifcan statement> | <ifcan body> <ifcan statement>
<ifcan statement> →  (<dimension> <try-fn>) | (<bindings> <try-fn>)
<try-fn>          →  (TRY <bindings> <body>) | (TRY <dimension> <body>)

(D) FUNCTION CALLS:

<fncall>  →  <try-fn> | (OPTIONAL <body>) |
             (IFNEEDED <body>) |
             (IDB <body>) |
             (SUPPRESS <fn-clause>) |
             (SUSPEND <dimension>) |
             (ACTIVATE <dimension>) |
             <COND-stmt> | (GO <label>) |
             (BKTRK <label>) | <BIND-stmt>
             (this is like SET in LISP) |
             <dimension> | (PROC <bindings> <body>)


we need some additional concepts. Let k be an arbitrary
PSS instance defined as follows:

k: (dimension $D_k$) (bindings $B_k$) (initial state $I_k$)
(alternates $A_k$) (fn state $S_k$) (cenvirons $E_k$)
(cc summary $CCS_k$) (history $H_k$) (type $T_k$)
(final state $FS_k$) (successor $U_k$) (predecessor $R_k$)
(conditions $C_k$).

Let $V_k = \{X_1, X_2, \ldots, X_n\}$ be the variables appearing
in the <dimension>, $D_k$. Let $D_k$ itself be $(Q_k P_k F_k)$
where $Q_k$ is the <quantifiers> of $D_k$, $P_k$ its <proposition>
and $F_k$ its <fn clause> (see syntax of <dimension>
in Table IV). Let $\phi_k = \{(X_i \leftarrow N_i) \mid 1 \le i \le m\}$ be a specific
binding of the variables in $V_k$ such that $\phi_k(P_k)$ is
true. The bindings $B_k$ is the set of all such $\phi_k$. The
initial state $I_k$ is a conjunction of terms defined on the
variables in $V_k$ and possibly some constants in the data
base. Let $N_k = \{N_1, N_2, \ldots, N_n\}$ be the constants in $I_k$.
In each PSS instance k, $I_k$ is the same as the final state
of its predecessor, together with whatever might have
been added by the function invocation process performed
in k, all expressed in terms of the variables in
$V_k$ and constants in $N_k$. For every $\phi_k \in B_k$, $\phi_k(I_k)$ is
true for the given constants in $N_k$. Not all of the constants
in $N_k$ might be used in the PSS instance. The
used ones are precisely those that appear in the CC
summary of the PSS instance. Let $N_k^u$ denote the used
ones in k.

Two PSS instances K and J are similar if:

1) they have the same dimension, $D_k = D_j$. Thus $V_k = V_j$,
and K and J are in the same history list, $H_k = H_j$.
Also, the type of K is equal to the type of J;

2) they have the same control environment, $E_k = E_j$;
and

3) $V_k$, $N_k$ satisfy one or more of the cc summaries in
J, for some binding $\phi_k \in B_k$. Thus, $V_j$, $N_j$ satisfy one
or more cc summaries in K for some binding $\phi_j \in B_j$.

K is identical to J if $D_k = D_j$, $T_k = T_j$, $E_k = E_j$, $B_k = B_j$,
and $N_k^u = N_j^u$.

TABLE V
The Transformation Rules for the M&C Problem

(TR1) GOAL FUNCTION FOR M&C PROBLEM. IT SAYS, "PICK UP SOME PEOPLE Y,
POSSIBLY AS MANY AS THE VEHICLE WILL HOLD, LOAD THE VEHICLE WITH Y, AND
TAKE THEM TO Q". "PROC" IN THE STATEMENT BELOW STANDS FOR PROCEDURE.
IT IS SIMILAR TO PROG IN LISP, BUT DIFFERS FROM PROG IN THE SENSE THAT
PROC HAS BACKTRACKING.

(((PEOPLE X) (PLACE P Q) (P occupants X) (GOAL (Q occupants X)))
 (REPEAT (Q occupants X)
  (PROC ((SOME VEHICLE V) (SOME PEOPLE Y) (OPNL (Y is.capacity of V)) (Y among X))
   1. (ASSERT (V occupants Y) (NOT (P occupants Y)))
   2. (ASSERT (V position Q) (NOT (V position P)))
   3. (ASSERT (V occupants.are.occupants of Q) (NOT (V occupants Y))))))

(TR2) CANDO CLAUSE FOR LOADING A VEHICLE. IT SAYS, "UNLOAD THE VEHICLE
FIRST. IF THERE ARE MORE PEOPLE THAN THE VEHICLE CAN HOLD THEN DROP SOME
AND TRY AGAIN. IF THE VEHICLE IS NOT AT THE SAME PLACE AS THE PEOPLE ARE,
THEN BRING IT TO WHERE THEY ARE". EACH TRY STATEMENT IS PRECEDED BY THE
CONDITIONS ON WHOSE FAILURE THE TRY WOULD BE ATTEMPTED.

(CANDO ((PEOPLE X) (PLACE P) (VEHICLE V) (P occupants X)
        (ASSERT (V occupants X) (NOT (P occupants X))))
 1. (ASSERT (V position P))
 2. ((PEOPLE Z) (V occupants Z) (ASSERT (P occupants Z) (NOT (V occupants Z))))
 3. ((X ≤.capacity of V)
     (TRY ((SOME PEOPLE Y) (Y among X) (ASSERT (V occupants Y) (NOT (P occupants Y))))
      (IFDON (KILL (ASSERT (V occupants X) (NOT (P occupants X))))))))

(TR3) CANDO CLAUSE ASSOCIATED WITH BRINGING A VEHICLE FROM A PLACE P TO Q.
IT SAYS, "GET A PILOT AND TAKE THE VEHICLE. IF PILOT CANNOT BE REMOVED GET
SOMEONE TO GO WITH HIM".

(CANDO ((VEHICLE V) (PLACE P Q) (V position P)
        (ASSERT (V position Q) (NOT (V position P))))
 1. (((SOME PERSON Y) (V pilot Y) (V occupant Y))
  1A. (TRY ((SOME PERSON Z) (V pilot Z) (P occupants Z)
            (ASSERT (V occupants Z) (NOT (P occupants Z)))))
  1B. (TRY ((SOME PERSON X Y) ((V pilot X) (V pilot Y)) (P occupants (X Y))
            (ASSERT (V occupants (X Y)) (NOT (P occupants (X Y))))))))

TABLE VI
PSS Template

(dimension    DIMN)        <dimension> of the function
(init. state  MODELSTATE)  Initial state of model
(bindings     BINDINGS)    All possible variable bindings and current bindings
(alternates   PSSL)        PSS instances of possible alternates; also the
                           currently active function
(cc-summary   CCSL)        Summary of CC’s evaluated during the active
                           tenure of fn
(fn state     FNSTATE)     Function states: ACTIVE, SUCCESS, FAILURE,
                           SUSPENDED
(cenviron     CENVIRON)    The control environment
(history      PSSL)        Previous instances of PSS with the same dimension
(type         PSSTYP)      Any useful type classification of PSS
(final-state  MODELSTATE)  Model changes done by the PSS instance
(successor    PSSL)        Possible successor functions
(predecessor  PSSL)        List of parent functions
(conditions   CONDN)       Conditions appearing in REPEAT, COND and other
                           such statements that caused the current PSS to
                           be invoked
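
The similarity and identity tests on PSS instances can be sketched in Python. The sketch rests on several assumptions that are not in the source: a PSS instance is reduced to the handful of fields the tests need, a cc summary is modeled as a predicate on a binding, and the class and function names (`PSS`, `similar`, `identical`) are hypothetical. Only the first half of condition 3 is checked, since the text presents the second half as a consequence.

```python
from dataclasses import dataclass, field


@dataclass
class PSS:
    dimension: str
    cenviron: str
    pss_type: str
    bindings: list = field(default_factory=list)      # dicts: variable -> constant
    cc_summaries: list = field(default_factory=list)  # predicates on a binding
    used_constants: frozenset = frozenset()


def satisfies_any(bindings, summaries):
    """True if some binding satisfies at least one cc summary."""
    return any(summary(b) for b in bindings for summary in summaries)


def similar(k, j):
    return (k.dimension == j.dimension        # condition 1: same dimension
            and k.pss_type == j.pss_type      #   and same type
            and k.cenviron == j.cenviron      # condition 2: same environment
            and satisfies_any(k.bindings, j.cc_summaries))  # condition 3


def identical(k, j):
    return (k.dimension == j.dimension and k.pss_type == j.pss_type
            and k.cenviron == j.cenviron and k.bindings == j.bindings
            and k.used_constants == j.used_constants)


k3 = PSS("load", "default", "action",
         bindings=[{"Y": ("M1", "C1")}],
         cc_summaries=[lambda b: True])
k7 = PSS("load", "default", "action",
         bindings=[{"Y": ("C1", "C2")}],
         cc_summaries=[lambda b: "C1" in b["Y"]])
k9 = PSS("load", "no-backtrack", "action",
         bindings=[{"Y": ("C1", "C2")}],
         cc_summaries=[lambda b: True])
```

Two instances that differ only in their control environment, such as `k3` and `k9`, are not similar, so summaries recorded under one environment are not reused under the other.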

To understand how these work let us consider the solution
of the M&C problem.

4) M&C Problem Solution: The sequence of possible
function calls is shown in Fig. 2. Each box in Fig. 2 is labeled
to indicate its correspondence to the functions in
Table V. The boxes are numbered 1 through 15. For a
box with number i, let Ki denote its associated PSS instance.

Suppose we are at the beginning, and are at box 3 in
Fig. 2. Then the following sequence of actions might
happen (follow arrows in order):

               1
(Y (M1 M2))  --->  enter box 4
      | 2                | 3
      v                  v
I(V occupants Y)    enter box 5
D(P occupants Y)

The indicated I and D commands are returned by the
ASSERT function, TR1-1 (see Table V). This will cause
[CC4] and [CC1] to be evaluated, and the following cc
summaries to be returned to box 3 (since box 3 is still
active):²

[CCS4]:[K4, I(V occupants Y), (V ← VEHICLE)
        (Y ← (M,M)), T, Fail]

[CCS1]:[K4, D(P occupants Y), (P ← PLACE)
        (Y ← (M,M)), F, Fail]

Here K4 is the PSS instance at which the evaluation
took place. The variable bindings are indicated only in
terms of the kinds and types of objects used. T and F
are the CC evaluation outcomes, and Fail is the outcome
of K4. Notice that moving these two cc summaries to K3
makes it still possible for K3 to use these, because the
variables P and V have in K4 the same bindings as they
do in K3 and none of the terms appearing in the CC's
have changed in value between K3 and K4.

The failure of K4 brings us back to K3. Now the
choice of next bindings to be tried will be guided by
[CCB]. Either a (M,C) or a (C,C) will succeed. A more
interesting case is the following. Now suppose that
(M1, C1) are already on RBANK2. This would have
caused the following series of successful cc evaluations:

[CCS4']:[K4, I(V occupants Y), (V ← VEHICLE)
         (Y ← (M,C)), T, SUC]

[CCS1']:[K4, D(P occupants Y), (P ← PLACE)
         (Y ← (M,C)), T, SUC]

² We shall use M for missionaries, C for cannibals, Fail for failure, T
for true, SUC for success, and F for false.


[CCS2]:[/\ hi, I(V position Q), (Q ← PLACE)
        (V ← VEHICLE), T, SUC]

[CCS1'']:[/vir„, I(Z occupants of Q), (Q ← PLACE),
          (Z ← (V occupants ?)) (V ← VEHICLE), T, SUC]

All these cc summaries would now be available at box 3
of Fig. 2, since PROC would still have been active during
the whole course of events.

After this, PROC will be reinvoked because REPEAT
will still have been active. New instances of PSS for
boxes 3 and 4 will be created. These will be similar to
the previously created instances. The BOAT is now at
RBANK2. The CANDO clause, box 5, (statement TR2 in
Table V) will now cause the BOAT to be brought back to
RBANK1 with a pilot, who in this case will have been the
missionary M1. This time when a new pair of PEOPLE
is picked from RBANK1, the system will already check
for the satisfaction of the successful path of CC executions,
depicted by the summaries [CCS4'], [CCS1'],
[CCS2], [CCS1'']. Picking another (M,C) will in this case
fail; (C,C) will succeed in satisfying the cc summaries.
Thus, with anticipation the system will pick the right
candidates likely to lead to success. From here on the
availability of the cc summaries, and the guidance provided
by [CCB], will enable the system to always pick
the right candidates. The following solution will be obtained.


            RBANK1                      RBANK2

Step 1:     (M1, M2, M3, C1, C2, C3)    ∅
Step 2:     (M2, M3, C2, C3)            (M1, C1)
Step 3:     (M1, M2, M3)                (C1, C2, C3)
Step 4:     (C1, M3)                    (M1, M2, C2, C3)
Step 5:     (C1, M3, M1, C2)            (M2, C3)
Step 6:     (C1, C2)                    (M1, M2, M3, C3)
Step 7:     (C1, C2, C3)                (M1, M2, M3)
Step 8:     (C1)                        (C2, C3, M1, M2, M3)
Step 9:     (C1, C2)                    (C3, M1, M2, M3)
Step 10:    ∅                           (M1, M2, M3, C1, C2, C3).


Step  5  in  the  above  solution  is  caused  by  box  14  in 
Fig.  2. 
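
The ten-step solution above can be checked mechanically. The following Python sketch is an illustration only, not part of the MDS: it transcribes the bank contents from the solution table and tests two properties at every step, namely that all six people are accounted for exactly once and that neither bank holds missionaries outnumbered by cannibals. The safety rule is the customary M&C constraint and is assumed here; the names `STEPS`, `bank_is_safe` and `check_solution` are hypothetical.

```python
# Bank contents (RBANK1, RBANK2) after each step, transcribed from the table.
STEPS = [
    ("M1 M2 M3 C1 C2 C3", ""),
    ("M2 M3 C2 C3", "M1 C1"),
    ("M1 M2 M3", "C1 C2 C3"),
    ("C1 M3", "M1 M2 C2 C3"),
    ("C1 M3 M1 C2", "M2 C3"),
    ("C1 C2", "M1 M2 M3 C3"),
    ("C1 C2 C3", "M1 M2 M3"),
    ("C1", "C2 C3 M1 M2 M3"),
    ("C1 C2", "C3 M1 M2 M3"),
    ("", "M1 M2 M3 C1 C2 C3"),
]
EVERYONE = frozenset("M1 M2 M3 C1 C2 C3".split())


def bank_is_safe(people):
    """Assumed rule: missionaries, if present, are not outnumbered."""
    missionaries = sum(p.startswith("M") for p in people)
    cannibals = sum(p.startswith("C") for p in people)
    return missionaries == 0 or missionaries >= cannibals


def check_solution(steps):
    """Return a list of (step number, complaint); empty if all steps pass."""
    problems = []
    for number, (bank1, bank2) in enumerate(steps, start=1):
        left, right = set(bank1.split()), set(bank2.split())
        if left & right or (left | right) != EVERYONE:
            problems.append((number, "people lost or duplicated"))
        if not (bank_is_safe(left) and bank_is_safe(right)):
            problems.append((number, "missionaries outnumbered"))
    return problems


print(check_solution(STEPS))  # prints []
```

Each step in the table is a round trip of the BOAT, so the sketch only inspects the states listed in the table; the intermediate states after each return crossing are not checked.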

5) Summary: Thus, the designer provides the high-level
control structure necessary to pass on to the
checker the right CC's to be evaluated, and to the instantiator,
the right model changes to be done. The designer
programs themselves are independent of the descriptive
data structures used. Again the templates and
instantiator provide a desirable isolation. The PSS itself
may be changed for different domains of discourse, or
different problem types. In this sense, the templates
and the rules of transformation, together with the PSS,
specialize the MDS to a given problem, or a given domain
of discourse. The problem solving control structures
are driven by the domain dependent data. The
checker, TP, designer, and instantiator are all part of
the MDS.

Fig. 2. Graph of function calls in the solution of the M&C problem.

Most importantly, there is a significant stratification
of knowledge in a domain, as seen by the system. Domain
dependent knowledge is made available to the system
as templates, as CC's or as TR's. The PSS templates
play a particularly important role. Depending
upon how and where a given piece of domain dependent
knowledge is specified the system uses it differently.

The relative isolation of the problem solving and
model management programs from the descriptive data
structures themselves makes the concept of MDS feasible.
The facility to arbitrarily specify descriptive data
structures as well as nondeterministic programs makes
the system highly flexible and powerful. The checker
and instantiator provide the basic foundation. These
two systems are small systems, and the programs here
can be made very efficient. These features give promise
that the proposed system architecture could operate in
the context of large data bases. By defining the templates
carefully the MDS system can be specialized to
operate efficiently in a given domain. The structure of
MDS is described in the next section.


III. The MDS

The block diagram of MDS is shown in Fig. 3. In this
figure DL(D), T(D), and K(D) are, respectively, the
definitions of descriptive language, templates and
knowledge (CC's and TR's) in a domain D. The LINGUIST,
TEMPEST, and QUEST are, respectively, the
subsystems that accept these definitions and create representations
for them. The TEMPEST and QUEST are
now working systems. The checker and instantiator are
presently under construction.

The data in DL(D), T(D), and K(D) specialize the
MDS for the domain. The rest of the block diagram is
self-explanatory.

Fig. 3. Block diagram of MDS: *» indicate pointers in data representations,
«•* indicate data and control flow paths, □ denote data
items, and O denote processors.


IV. Concluding Remarks

We have introduced the basic concepts of CI systems
and MDS. The CI systems provide a basis for the definition
of the concept of machine understanding in terms
of models that a machine is capable of building in a domain,
and the way the models are used. The understanding
exhibited at the problem solving level of the
checker is relatively simple understanding. A deeper
level of understanding is exhibited in the kinds of problems
that the TP can solve (see [10]). At the level of the
designer the level of understanding is very sophisticated.
The system is able to plan and build procedures to solve
problems.

In this paper we have discussed only a part of the
problem solving aspects of the system: the workings of
the checker and designer.

We are proposing the use of DL(D), T(D), and K(D)
to transfer domain dependent descriptive knowledge to
a computer. We have briefly indicated how such descriptive
knowledge could be used to solve problems in a
domain automatically.

The specification of DL(D), T(D), and K(D) in a domain
will, of course, require a very good understanding
of the concepts and problems in a domain. There are
several domains where, at present, such understanding
is available. The MDS provides a way of transferring this
understanding to a computer.

There is much work to be done to make the MDS a




IEEE TRANSACTIONS ON COMPUTERS, VOL. C-25, NO. 4, APRIL 1976


viable system. It is necessary to develop a working system
first. We are presently involved in this task.

References 

[1] R. E. Fikes, "REF-ARF: A system for solving problems stated as
procedures," J. Artificial Intelligence, vol. 1, no. 1, 1970.

[2] ---, "A heuristic program for solving problems stated as nondeterministic
procedures," Ph.D. dissertation, Dep. Comput.
Sci., Carnegie-Mellon Univ., Pittsburgh, PA, 1968.

[3] J. Derkson, J. F. Rulifson, and R. J. Waldinger, "The QA4 language
applied to robot planning," in 1972 Fall Joint Comput.
Conf., AFIPS Conf. Proc., vol. 41. Montvale, NJ: AFIPS Press,
1972, pt. II, pp. 1181-1187.

[4] G. D. Gibbons, "Beyond REF-ARF: Toward an intelligent processor
for a nondeterministic programming language," Ph.D. dissertation,
Dep. Comput. Sci., Carnegie-Mellon Univ., Pittsburgh,
PA, 1973.

[5] R. E. Fikes and N. J. Nilsson, "STRIPS: A new approach to the
application of theorem proving to problem solving," J. Artificial
Intelligence, vol. 3, no. 1, pp. 27-68, 1972.

[6] R. E. Fikes, A. A. Hart, and N. J. Nilsson, "Learning and executing
generalized robot plans," J. Artificial Intelligence, vol. 3, pp.
251-288, 1972.

[7] C. Hewitt, "Description and theoretical analysis (using schemata)
of PLANNER: A language for proving theorems and manipulating
models in a robot," Ph.D. dissertation, Dep. Mathematics,
Mass. Inst. Technol., Cambridge, 1972.

[8] A. Newell, J. D. Shaw, and H. A. Simon, "Report on a general
problem-solving program for a computer," in Proc. Int. Conf. Information
Processing, UNESCO, Paris, France, pp. 256-264;
also, reprinted in Comput. Automation, July 1959.

[9] S. Amarel, "On representations of problems of reasoning about
actions," in Machine Intelligence, vol. 3, D. Michie, Ed. Edinburgh,
Scotland: Edinburgh Univ. Press, 1968, pp. 131-170.

[10] C. V. Srinivasan, "Theorem proving in the meta description system,"
Dep. Comput. Sci., Rutgers Univ., New Brunswick, NJ,
Rep. SOSAP-TR-20.


Chitoor V. Srinivasan (M'63) was born in
Cuddappah, India, on November 6, 1933. He
received the B.S. degree from Madras University,
Madras, India, in 1953, the D.M.I.T. degree
in electronics from the Madras Institute
of Technology, Madras, in 1956, and the M.S.
and D.Eng.Sc. degrees in electrical engineering
from Columbia University, New York, NY,
both in 1963.

From 1956 to 1959 he was at the Tata Institute
of Fundamental Research, Bombay,
India. From 1962 to 1969 he was at the RCA Laboratories, Princeton,
NJ. Presently he is with the Department of Computer Science, Rutgers
University, New Brunswick, NJ.

Dr. Srinivasan is a member of the Association for Computing Machinery.


PAS-II:  An  Interactive  Task-Free  Version  of  an  Automatic 

Protocol  Analysis  System 

DONALD A. WATERMAN AND ALLEN NEWELL, FELLOW, IEEE


Abstract: PAS-II, a computer program which represents a
generalized version of an automatic protocol system (PAS-I), is
described. PAS-II is a task-free, interactive, modular data analysis
system for inferring the information processes used by a
human from his verbal behavior while solving a problem. The
output of the program is a problem behavior graph: a description
of the subject's changing knowledge state during problem
solving. As an example of system operation the PAS-II analysis
of a short cryptarithmetic protocol is presented.

Index Terms: Cryptarithmetic, hypothesis formation, model
building, natural language processing, problem space, production
system, protocol analysis.

I.  Introduction 

AUTOMATIC protocol analysis is a joint effort by
man and machine to infer from the record of the
time course of a subject's behavior, the underlying information
processes. As developed [5], it usually refers

Manuscript received February 15, 1973. This work was supported in
part by the National Institutes of Health under Grant MH-07732 and
in part by the Advanced Research Projects Agency of the Office of the
Secretary of Defense, which is monitored by the Air Force Office of
Scientific Research, under Grant F44620-70-C-0107.

D. A. Waterman is with the Rand Corporation, Santa Monica, CA.

A. Newell is with the Department of Computer Science, Carnegie-Mellon
University, Pittsburgh, PA.


to the verbalizations of a subject solving some problem
under instructions to think out loud. Protocol analysis
designates the full range of activities engaged in by the
psychologist when working with protocols: description
of the subject's behavior according to an hypothesized
model, induction of new rules, derivation of consequences
from a model in the context of specific data,
and measurement of adequacy of a model. The initial
focus of our work has been behavior description in
terms of information processes, given an hypothesized
general model (the so-called problem space in which the
subject operates).

The PAS-I system [14], [15] was our first attempt at
automatic protocol analysis. This is a fully automatic,
noninteractive, specialized system designed to analyze
cryptarithmetic protocols and produce as output a
problem behavior graph (PBG) describing the subject's
search through a posited problem space. The protocol
analysis is represented as a sequence of processing
stages that eventually transform the raw protocol into a
problem behavior graph. At each stage rules are applied
which effect a transformation of the data. The organization
of PAS-I is shown in Fig. 1.


PROGRAMMING  OVER  A  KNOWLEDGE  BASE:  THE  BASIS  FOR  AUTOMATIC  PROGRAMMING 


C.V.  Srinivasan 

Department of Computer Science, Hill Center
Rutgers University, New Brunswick, N.J. 08903

Abstract: This paper introduces the notion of using a highly flexible
general problem solving system as the basis for developing domain dependent
automatic programming systems that can actively and intelligently
assist their users to formulate problems and develop programs in the domain.
The system is called the Meta Description System. It is currently being
implemented in LISP 1.6. The system accepts definitions of description
schemas for describing KNOWLEDGE in a domain, and uses these schemas to
specialize itself as an efficient problem solver in the domain. It also
has the capability to accept definitions of a language of discourse for
a domain and have the users communicate with it in the specified language.

1.  Introduction 

Our objective is to create an automatic programming (AP) system that can
actively and intelligently assist users to solve problems in a domain. A domain
might be as complex as the design of a computing system, or it might be the
diagnosis and treatment of a disease system; it might be a piece of mathematics
or psychology. Our mode of operation will be the following:

Suppose one wanted to create an automatic programming system for a domain D.
Then one would first specify to our system some core knowledge in the domain D.
This would consist of schemas specifying how objects in the domain D are described
and their descriptions represented in the data base (these schemas are
called description schemas); descriptions of specific objects in the domain
satisfying the given schemas (we shall refer to these as instances (or models)
of objects in the domain); rules specifying how given descriptions of objects
in a domain may be transformed to new ones satisfying given criteria; and possibly
also strategies for problem solving in the domain D. These specifications
of knowledge in the domain will cause the system to create a data base, called
the Coherent Data Base for the domain D, CDB(D). The system will, of course,
assist the user in setting up the CDB(D), by looking for inconsistencies, seeking
out missing information, and where necessary itself supplying the missing
information. The CDB(D) constitutes a knowledge base over which all domain
dependent programming in the domain D will take place. As the system is used,
its knowledge base will continue to expand. The system itself will use this
knowledge base automatically, to intelligently assist its users to solve problems
in the domain D.

* This work was supported by grant # DAHC15-73-G-6, from ARPA.




The system is called the meta-description system (MDS)*. It is a meta
system in the sense that it accepts definitions of description schemas (in
terms of devices called templates and sense definitions), for a domain D, and
uses these schemas to specialize itself in an active way to solve problems
efficiently in domain D. This specialization occurs in three ways:

(i) In the data structures used to represent descriptions of (models
of) objects in domain D,

(ii) In the problem solving control structures used for the domain D,
and

(iii) In the way problem solving experiences in the domain are summarized
and later automatically used for self-improvement.

The architecture of the MDS allows for fundamental structural changes to
take place in the system, to efficiently utilize the available domain dependent
knowledge. The MDS is thus a general problem solving system that can specialize
itself to perform efficiently in a given domain. At every point in its operation
the MDS can automatically make full use of its knowledge base to actively
and intelligently assist its users. We shall refer to programming in the
context of the MDS as programming over a knowledge base.

There are several new concepts in the architecture of the MDS. Usually
general problem solving systems have a way of imposing their own will on
everything around them. They would demand that data be represented in certain
ways, they might demand that problems be stated only in certain ways, and they
often strongly resist interference with their problem solving procedures--they do
not take to advice easily. The limitations caused by these were well recognized
early in the game. The trend towards the development of programming
languages like PLANNER [Hewitt (1972)] and CONNIVER [McDermott (1973)] was,
in fact, a response to overcome this limitation. These language systems do
not fix a priori any problem solving scheme. They let the designer specify
schemes and strategies for given domains. In doing this, however, they do
not provide any automatic and intelligent problem solving help to the users.

It  is  the  programmers'  responsibility  to  specify  and  develop  all  the  problem 


* Since August 1971, our work on the MDS has been partially supported by a
grant from NIH, Grant No. RR643.




The general problem solvers in the MDS are very flexible and obliging
ones: They do not demand that data be represented one way or another, and
also more importantly they do not impose any a priori chosen search strategy
for problem solving. Representations for the descriptions (models) of objects
in a domain D follow the dictates of the description schemas for the domain,
and not the dictates of the problem solvers. More importantly, the problem
solving protocols--summaries of the system's problem solving experiences--may
themselves be treated as objects in the domain D, with their own associated
description schemas. We shall refer to these as the problem solving schemas
for domain D, PSS(D). PSS(D) will again be specified in terms of devices
called templates and sense definitions. For a domain D, its PSS(D) will
specify the problem solving control structures and search strategies. Instances
of PSS(D) (models of problem solving experiences) may then be used by
the problem solvers in the same way as any other data in the Coherent Data
Base. The PSS(D) may be so defined that the problem solver improves itself
by using the models of prior experiences. How is this all done? The full
answer to this question is necessarily a complex one. We shall here illustrate
the operation of the MDS with a small example, chosen from Balzer's paper
[Balzer (1973)]. We shall use this example to introduce the basic conventions
of the MDS, its operational characteristics, its logical processes, and to
show how it does problem solving. Later, in section 3 we shall comment
further on the MDS and compare it with other works in Automatic Programming,
to place it in perspective with the other works.

We  are  at  present  in  the  early  stages  of  implementation  of  MDS.  We 
expect  to  complete  the  implementation  of  all  of  its  facilities  in  about  two 
years.  The  data  base  management  part  of  the  MDS  is  expected  to  be  ready  in 
the  Spring  of  1974.  The  implementation,  so  far,  has  been  in  LISP1.6.  We 
expect  to  convert  the  existing  system  to  INTER-LISP  and  continue  further  work 
in  a  TENEX  system. 


2.1)  Specification  of  the  Description  Schema 

Briefly,  the  problem  is  the  following: 

"A  PERSON  is  HAPPY  if  he/she  has  a  COMPATIBLE  MARRIAGE,  or  is  RICH.  A 
MARRIAGE  is  COMPATIBLE  if  the  COUPLE  has  a  common  hobby,  and  the  wife  is  not 
more  than  5  YEARS  older  than  the  husband .  A  PERSON  is  RICH  if  he/she  is  worth 
more  than  a  million  DOLLARS.  Make  JOHN  happy." 

The words in capitals in the above statement are the objects of the domain
of this problem, which we shall now describe to our system. The underlined
words will appear as relation names in our system. We begin by telling
the system how to describe PERSON, HAPPY, COMPATIBLE, MARRIAGE, RICH, COUPLE,
YEAR and DOLLAR. In effect we shall say that a PERSON is an individual (node,
in contrast to a list or tuple) with a name, who has an age, some hobbies, a
worth, some attribute, a sex, an emotion, a marriage, and may have a wife,
or a husband (spouse). The template for this is shown below. The flag RN
associated with the PERSON template indicates that a PERSON template
is a regular node, i.e. a node with a name. Thus, every instance of PERSON
will have a name in the CDB.

((PERSON RN)
    (age (YEARS TI))           (worth (DOLLARS TI))
    (hobbies (HOBBIES $L))     (loves (PEOPLE $L))
    (marriage (MARRIAGE $N))   (sex (SEX RN))
    (attributes (ATBTL $L), SENSE1)
    ((spouse $) PERSON, SENSE2)
    ((wife $) PERSON, SENSE3)
    ((husband $) PERSON, SENSE4)
    (emotion (EMOTION RN), SENSE5))


Let us follow the other definitions in the PERSON template. PERSON calls
other templates like YEARS, MARRIAGE, etc. via relation names like age, hobbies,
etc. YEARS has been declared as a Terminal Integer (TI) template. Every instance
of YEARS is an integer with dimension YEARS. Similarly, DOLLARS is also
a terminal integer. Notice that in the CDB, two integers, say 8 YEARS and
1000 DOLLARS, will be recorded as objects with different dimensions. HOBBIES
is a dummy list ($L) template. Every instance of HOBBIES is a list (say,
a list of ACTIVITIES), and not every instance of HOBBIES need have a name in
the CDB. MARRIAGE is, similarly, a dummy node ($N) template. Not every instance
of MARRIAGE will have a name in the CDB. The attributes of a PERSON
should be an instance of (ATBTL $L), and so also, a PERSON's emotion should
be an instance of (EMOTION RN), and the sex of a PERSON is an instance of
(SEX RN). We shall choose not to associate any relation names with the
EMOTION and SEX templates. Since both of these are regular templates, instances
of these in the CDB will just be descriptive names like SAD, HAPPY,
and MALE, FEMALE etc.

The relations spouse, wife and husband are defined by the sense definitions
SENSE2, SENSE3 and SENSE4. The $ flag associated with these relations
indicates that their values can always be computed from the sense definitions,
and thus need not be stored in the CDB for any instance of PERSON. The EMOTION
of a PERSON is defined by SENSE5, but we require that its value be stored
in the CDB for every PERSON. We shall later see how the sense definitions are
specified. Let us now complete the definitions of the other templates.

((MARRIAGE $N) (partners (COUPLE $L) SENSE6) (quality (MQUAL RN) SENSE7))

((COUPLE $L) (elem (2 PERSON)))

((ATBTL $L) (elem (ATBT RN)))

((HOBBIES $L) (elem (ACTIVITIES RN)))

((PEOPLE $L) (elem PERSON))

COUPLE is constrained to be a list of exactly two PERSONS. HOBBIES
is a list of an arbitrary number of ACTIVITIES, where each ACTIVITIES is a
regular node. The above templates define the description structure of objects
in the domain of our problem. Let us now create some of the descriptive names.
In the commands below "IT" stands for "Instantiate Template", and "DR" stands
for "Delete Relation". In the CDB, "?" denotes an unknown. A "?" in a list
indicates that the list may contain additional elements. The following instantiations
are now done:

IT(SEX MALE)  IT(SEX FEMALE)

IT(EMOTION HAPPY)  IT(EMOTION SAD)  IT(EMOTION BLAH)

IT(ACTIVITIES GARDENING)  IT(ACTIVITIES PROGRAMMING)

IT(ATBT RICH)  IT(ATBT POOR)  IT(ATBT ORDINARY)  IT(ATBT THIEF)

IT(MQUAL COMPATIBLE)  IT(MQUAL LOUSY)

For SEX the CDB will now have: (SEX instance (MALE FEMALE ?)),
where "instance" is a system relation, and (MALE FEMALE ?) is a list. If SEX
is constrained to have only two instances, we may now indicate this by simply
removing the ? from the (MALE FEMALE ?) list. In the CDB, new elements may
be introduced in a list or a set only if the list or set contains a ?. So,
we may now issue DR(SEX instance ?). Just to see what happens, let us now
also instantiate a PERSON, called JOHN: IT(PERSON JOHN). This would, of course,
cause (PERSON instance (JOHN ?)) to be created in the CDB, and JOHN itself will
have the following structure associated with it.


The model of JOHN is a tuple consisting of pointers, defining the various
relations associated with JOHN. The name of the model is JOHN, it is an
instance of PERSON, and it points to JOHN's age, worth, hobbies, etc. The
pointers appear in the model in the same order as their associated relations
appear in the template PERSON. Relation symbols in the template with $ flags
do not have associated pointers in the model. Initially all the unknown relation
values are set to ?. We have now created JOHN about whom we know nothing,
except that JOHN is an instance of PERSON. In the CDB if (x r (y1 y2 ... yn))
is true (i.e. the model of x points to the list (y1 y2 ... yn) for the relation
symbol r) then it is interpreted as (x r y1) (x r y2) ... (x r yn). Thus
(JOHN attributes (RICH THIEF)) would mean (JOHN attributes RICH) and (JOHN
attributes THIEF) are both true. Also, for every (x r y) in the CDB, the CDB
will also contain (y r← x), where r← is the inverse of r. That is, if the
model of x points to y for the relation symbol r, then the model of y will
point back to x for the relation symbol r←. Let us now take a look at the
sense definitions. SENSE1 is given below. It is associated with (PERSON
attributes). In the definition of SENSE1 read *! as the "current instance
of PERSON's", and read "(x is RICH)" as "(x EQ RICH)" ("is" has the status
of EQ in LISP). We shall also use "is" together with relation symbols to
improve readability wherever convenient.


SENSE1: (PERSON attributes)

((ATBT x) | ((x is RICH) <=> (*! worth is ≥ 1000000))
            ((x is POOR) <=> (*! worth is ≤ 1000))
            ((x is ORDINARY) <=> ~(x is RICH) ~(x is POOR))
            ((*! attributes THIEF) => (x is THIEF)))

Here ((*! attributes THIEF) => (x is THIEF)) is interpreted as "If
THIEF is declared to be an attribute of *! then (x is THIEF)." Thus, to set
up (JOHN attributes THIEF) someone should declare IR(JOHN attributes THIEF),
where "IR" stands for "Instantiate Relation". Let us refer to this sense definition
by SENSE1(*!, x) indicating that it has two arguments: One is the
current instance of PERSON at which it is being evaluated, and the other, x,
is the attribute for which it is desired to know whether (*! attributes x) is
true or not. Every sense definition is thus a function of exactly two arguments,
one of which is always *!. *! is called the anchor of a sense definition.

If SENSE1(*!) is issued then the system will attempt to find all the
attributes of *! that satisfy SENSE1. If none can be found then it will
return ?. In the evaluation of SENSE1(*!), (*! attributes THIEF) will be true
if it is so indicated already in the data base. In the evaluation of
SENSE1(*!, x), (*! attributes THIEF) is true if x is bound to THIEF. Thus, in
the definition of SENSE1, (*! attributes THIEF) has a special status, since
SENSE1 itself defines (*! attributes).
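
The evaluation of a sense definition against partial knowledge can be
illustrated by a small Python sketch. It is not the MDS implementation: the
function name sense1, the dictionary used as the model of a PERSON, the key
declared_attributes, and the reading of the two thresholds as "worth at least
1000000" and "worth at most 1000" are all assumptions made for this
illustration.

```python
UNKNOWN = "?"  # stands for the unknown value ? of the three-valued logic


def sense1(person):
    """Return the attributes implied by SENSE1 for a person, or UNKNOWN.

    `person` is a hypothetical dictionary model; missing keys are unknown.
    """
    found = []
    worth = person.get("worth", UNKNOWN)
    if worth != UNKNOWN:
        if worth >= 1_000_000:      # assumed reading of the RICH condition
            found.append("RICH")
        elif worth <= 1000:         # assumed reading of the POOR condition
            found.append("POOR")
        else:                       # neither RICH nor POOR
            found.append("ORDINARY")
    # THIEF holds only if it has been declared, e.g. by an IR command.
    if "THIEF" in person.get("declared_attributes", ()):
        found.append("THIEF")
    return found if found else UNKNOWN


john = {"name": "JOHN"}
print(sense1(john))                 # nothing is known about JOHN yet
john["worth"] = 2_000_000
john["declared_attributes"] = ["THIEF"]
print(sense1(john))
```

With nothing known about JOHN the sketch returns ?, which mirrors the
behaviour described above for SENSE1(*!).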

As the reader might have already guessed, the sense definitions are evaluated
over a three-valued logic system with the values T, ? and NIL; T dominates ?,
and ? dominates NIL. The other sense definitions are shown below:

SENSE2: (PERSON spouse): ((PERSON x) | ~(*! is x)(*! marriage.partners x))

This definition also says "the list of ALL PERSONS x such that ...", but the
MDS will interpret this as "THE PERSON x such that ...", because the template
for PERSON says that the spouse of a PERSON is a PERSON and not a list of
PERSONS. The spouse of a PERSON is distinct from the PERSON and is the
PERSON's "marriage.partners". The "." here indicates concatenation of
relations. It corresponds to a relation path in the CDB.
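
The two CDB conventions used so far, storing an inverse pointer for every
relation and following a concatenated relation path, can be sketched in
Python as follows. The class CDB, its method names, and the generated inverse
name (the relation name followed by "<-") are hypothetical and serve only to
illustrate the idea.

```python
from collections import defaultdict


class CDB:
    """Toy data base: every (x r y) is stored together with its inverse."""

    def __init__(self):
        # links[x][r] is the list of objects that x points to under r.
        self.links = defaultdict(lambda: defaultdict(list))
        self.inverse_name = {}

    def declare_inverse(self, r, r_inv):
        self.inverse_name[r] = r_inv
        self.inverse_name[r_inv] = r

    def inverse(self, r):
        # Undeclared inverses get a generated name (an assumption).
        return self.inverse_name.get(r, r + "<-")

    def _link(self, a, r, b):
        if b not in self.links[a][r]:
            self.links[a][r].append(b)

    def assert_relation(self, x, r, ys):
        # (x r (y1 ... yn)) is read as (x r y1) ... (x r yn).
        if not isinstance(ys, (list, tuple)):
            ys = [ys]
        for y in ys:
            self._link(x, r, y)
            self._link(y, self.inverse(r), x)

    def follow(self, x, path):
        """Follow a dotted relation path such as 'marriage.partners'."""
        current = [x]
        for r in path.split("."):
            reached = []
            for node in current:
                for y in self.links.get(node, {}).get(r, []):
                    if y not in reached:
                        reached.append(y)
            current = reached
        return current


db = CDB()
db.declare_inverse("marriage", "marriage of")
db.assert_relation("JOHN", "marriage", "M1")
db.assert_relation("M1", "partners", ["JOHN", "MARY"])
partners = db.follow("JOHN", "marriage.partners")
spouse = [p for p in partners if p != "JOHN"]   # SENSE2: distinct from *!
print(partners, spouse)
```

The last two lines follow the path used in SENSE2 and then drop the anchor
itself, which is the role of the condition ~(*! is x).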

SENSE3: (PERSON wife): ((PERSON x) | (x sex is FEMALE)(*! spouse x))

SENSE4: (PERSON husband): ((PERSON x) | (x sex is MALE)(*! spouse x))




SENSE5: (PERSON emotion): ((EMOTION x) |
    ((x is HAPPY) <=> ((*! attributes RICH) v (*! marriage.quality COMPATIBLE))
                      ~(*! attributes THIEF))
    ((x is SAD) <=> (*! attributes POOR) v (*! marriage.quality LOUSY))
    ((x is BLAH) <=> ~(*! emotion HAPPY) ~(*! emotion SAD)))

SENSE6: (MARRIAGE partners): ((PERSON x y) | (x sex is MALE)(y sex is FEMALE)
                                             (x loves y)(y loves x)
                                             (*! marriage of x)(*! marriage of y))

Notice that in SENSE6 *! stands for "the current instance of MARRIAGE", and
"marriage of" is used as the inverse of "marriage". Notice also that the
PERSON template specifies that the marriage of a PERSON is unique, since
MARRIAGE is a node template. This precludes a PERSON from having more than
one marriage.

SENSE7: (MARRIAGE quality): ((MQUAL x) | (SOME PERSON y)(SOME ACTIVITIES z)
    ((x is COMPATIBLE) <=> (y spouse.age.< (PLUS (y age) 5))(y hobbies z)
                           (y spouse.hobbies z))
    ((x is LOUSY) <=> ~(*! quality COMPATIBLE)))

This completes the definition of the description schema for the domain of our
example. The system uses the sense definitions to keep track of the
interactions among the various relations. Thus, (PERSON sex) is used in the
definitions of (PERSON wife), (PERSON husband) and (MARRIAGE partners). The
MDS will, therefore, set up

DETL(PERSON sex) = ((PERSON wife) (PERSON husband) (MARRIAGE partners))

It does seem reasonable that a PERSON's sex should determine the PERSON's
wife, husband and MARRIAGE partners. The DETLs associated with the various
(template, relation) pairs in our example are shown below:

DETL(PERSON age) = ((MARRIAGE quality))

DETL(PERSON worth) = ((PERSON attributes))

DETL(PERSON hobbies) = ((MARRIAGE quality))

DETL(PERSON loves) = ((MARRIAGE partners))

DETL(PERSON marriage) = ((MARRIAGE partners) (PERSON spouse) (PERSON emotion)
                         (MARRIAGE quality))

DETL(PERSON sex) = ((PERSON marriage) (PERSON wife) (PERSON husband))

DETL(PERSON attributes) = ((PERSON emotion))

DETL(PERSON spouse) = ((PERSON wife) (PERSON husband) (MARRIAGE quality))

DETL(PERSON emotion) = NIL.
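
One way to picture how such lists could be derived is to invert a table that
records, for each sense definition, the relations it mentions. The Python
sketch below does this for a few of the definitions above. The function name
build_detl and the hand-written uses table are assumptions for illustration;
the table is deliberately partial and does not reproduce every entry listed
above.

```python
def build_detl(uses):
    """Invert 'definition uses relation' into 'relation determines definitions'."""
    detl = {}
    for defined, used_relations in uses.items():
        for used in used_relations:
            entry = detl.setdefault(used, [])
            if defined not in entry:
                entry.append(defined)
    return detl


# Hypothetical, partial record of which relations each definition mentions.
uses = {
    ("PERSON", "attributes"): [("PERSON", "worth")],
    ("PERSON", "wife"): [("PERSON", "sex"), ("PERSON", "spouse")],
    ("PERSON", "husband"): [("PERSON", "sex"), ("PERSON", "spouse")],
    ("MARRIAGE", "partners"): [("PERSON", "sex"), ("PERSON", "loves")],
}

detl = build_detl(uses)
print(detl[("PERSON", "sex")])
print(detl[("PERSON", "worth")])
```

A relation that no definition mentions simply has no entry, which corresponds
to a DETL of NIL.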

Suppose we now wanted to say the following: "If you want to make a PERSON
RICH then make him rob a BANK for 1000000 dollars. Also, if you succeed in
doing this then make the PERSON a THIEF." Let us assume that the template for
BANK already exists, and also that a function of two arguments, called
ROBBANK, has already been defined. The above procedure may then be declared
to the MDS as a transformation rule, as follows:

TR1: ((PERSON x)(GOAL(x attributes RICH)) (((SOME BANK B)(ROBBANK x B)
     (ASSERT(x worth is 1000000))(IFDON (ASSERT(x attributes THIEF))))).

The IFDON clause is activated only if both ROBBANK and ASSERT are successfully
completed. The entire function is said to be successful if the IFDON clause
completes successfully. Transformation rules like this operate in a
backtracking environment. In response to (GOAL(JOHN attributes RICH)) the MDS
will invoke TR1, if JOHN is not already RICH in the CDB. Our objective now is
to achieve (GOAL(JOHN emotion HAPPY)). Before we see what might happen in
response to this command, let us first consider how the description schema so
far given is used by the MDS to establish and control the CDB for the domain,
and how the CDB is itself used for problem solving.
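
The control flow of a rule like TR1 can be sketched in Python as follows. The
helper names run_rule and make_tr1, the dictionary model of a PERSON, and the
rob_bank callback are hypothetical; the sketch also leaves out the
backtracking environment, so steps already performed are not undone on
failure.

```python
def run_rule(steps, ifdon):
    """Run the steps in order; run the IFDON action only if all succeed.

    The rule as a whole succeeds only if the IFDON action succeeds.
    """
    for step in steps:
        if not step():
            return False
    return ifdon()


def make_tr1(person, rob_bank):
    """Build the steps of TR1 for one person (a plain dictionary here)."""

    def rob():
        return rob_bank(person)

    def assert_worth():
        person["worth"] = 1_000_000
        return True

    def assert_thief():
        person.setdefault("attributes", []).append("THIEF")
        return True

    return [rob, assert_worth], assert_thief


john = {}
steps, ifdon = make_tr1(john, lambda p: True)    # the robbery succeeds
print(run_rule(steps, ifdon), john)

jack = {}
steps, ifdon = make_tr1(jack, lambda p: False)   # the robbery fails
print(run_rule(steps, ifdon), jack)
```

In the second call neither ASSERT is reached, so the model of the person is
left untouched.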

2.  The  Data  Management  System  and  the  Problem  Solvers. 

MDS has a hierarchy of three problem solvers: CHECKER-INSTANTIATOR (CHIN),
THEOREM PROVER (TMPR) and DESIGNER. The CHECKER evaluates sense definitions in
the three-valued logic system, and the INSTANTIATOR sets up, updates, deletes
and retrieves data in the CDB in accordance with the rules specified by the
templates. These two together constitute the data management system of the
MDS.
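
A minimal Python sketch of a three-valued logic over T, ? and NIL is given
below. It reads "T dominates ?, and ? dominates NIL" as the rule for
disjunction and assumes the dual rule for conjunction (Kleene's strong
three-valued connectives); that reading, and the names or3, and3 and not3,
are assumptions rather than a description of the CHECKER.

```python
T, U, NIL = "T", "?", "NIL"          # U stands for the unknown value ?
_RANK = {NIL: 0, U: 1, T: 2}


def or3(*values):
    """Disjunction: the dominating value wins (T over ?, ? over NIL)."""
    return max(values, key=_RANK.__getitem__)


def and3(*values):
    """Conjunction, assumed to be the dual of or3."""
    return min(values, key=_RANK.__getitem__)


def not3(value):
    """Negation swaps T and NIL and leaves ? unchanged."""
    return {T: NIL, U: U, NIL: T}[value]


def happy(rich, compatible, thief):
    # The HAPPY condition of SENSE5, evaluated on three-valued inputs.
    return and3(or3(rich, compatible), not3(thief))


print(happy(U, U, U))      # nothing known about the person
print(happy(T, U, T))      # rich, but a thief
print(happy(U, T, NIL))    # compatible marriage, not a thief
```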

The CHECKER always evaluates sense definitions modulo the objects in the CDB.
Thus all the quantifiers in a sense definition become bounded quantifiers.
Since each sense definition has an anchor, *!, the CHECKER uses it to begin its
search over the data base. In fact the sense definitions may be written
carefully to make this search efficient. There are basically two kinds of sense
definitions: imperative ones and declarative ones. Let S_{T,r}(*!,x) be the
sense definition associated with template T and relation symbol r. Then
S_{T,r}(*!,x) is imperative if (*! r x) <=> S_{T,r}(*!,x). Imperative S_{T,r}'s
may be used to find {x | (*! r x)}. If (*! r x) => S'_{T,r}(*!,x), then
S'_{T,r} cannot be used to find {x | (*! r x)}. It can, however, be used to
find a superset of the relation r, or, given a (*! r x), it can be used to find
out whether it is TRUE, ? or NIL. To force the CHECKER to look for this
declared x, we shall write declarative definitions of this kind as:
S_{T,r}(*!,x) = (*! r x) S'_{T,r}(*!,x). In this case S_{T,r}(*!) is either ?
or whatever is stored in the CDB.

Besides returning the truth value of an S_{T,r}, the CHECKER also will return
certain subexpressions of S_{T,r}, called residues. Let a be a particular
anchor. If S_{T,r}(a,x) = T, then the true residue of S_{T,r}(a,x) is the part
of S_{T,r}(a,x) that caused it to be true (the support of the condition).
Similarly, if the condition is ?, then the residue is the part of the
condition that evaluated to ?. And, if S_{T,r}(a,x) is NIL then the false
residue of S_{T,r}(a,x) will be the part of it that evaluated to NIL. Let us
consider a small example.

Let P = (x1 v x2)(~x1 v x3). Then for various valuations φ of x1, x2 and x3,
the various residues would be as shown below:

        valuation φ
   x1     x2     x3      TRφ(P)          Rφ(P)           FRφ(P)

   T      T      T       (x1 v x2)x3
   T      T      ?                       [illegible]
   T      T      NIL                                     (~x1 v x3)
   T      ?      T       [illegible]
   ?      ?      T                       (x1 v x2)
   NIL    ?      ?                       x2
   NIL    NIL    T                                       (x1 v x2)


* TRφ(P) is the true residue of P for valuation φ, and similarly we have
FRφ(P) and Rφ(P).

These residues are used in various ways in problem solving. The residues are
used by the TMPR to construct new objects that satisfy given conditions. The
residues and false residues together (called the Not True Part) are used by
the DESIGNER for means-end analysis. All the residues are used by the CHECKER
to speed up the data base updating process: if (x r y) is to be changed to
(x r z) then the CHECKER will check all the residues associated with every
(T_i, r_i) in the DETL(x,r). Only if a residue changes value (i.e. a true
residue or a residue evaluates to NIL, or a false residue evaluates to T)
should the CHECKER evaluate the parent sense definition. In a problem solving
process the residues are also used to summarize the problem solving
experience: if an action succeeded, the associated true residues will explain
the reasons for success; if it failed, the associated false residues say why
the failure occurred. Summaries of these residues may then be used, with
appropriate generalizations, for guiding the problem solver subsequently when
"similar" problem solving situations arise. We shall briefly see the use of
residues in the discussion of our example in section 3.
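
The notion of a residue can be made concrete with a small Python sketch for
conditions written as a conjunction of disjunctions, such as
P = (x1 v x2)(~x1 v x3). The extraction rule used here is an assumption, not
the CHECKER's actual algorithm: keep each clause whose value equals the value
of the whole condition, and within it keep the literals that have that value.
Conjunction is assumed to take the least dominant value and disjunction the
most dominant one.

```python
T, U, NIL = "T", "?", "NIL"          # U stands for the unknown value ?
RANK = {NIL: 0, U: 1, T: 2}
NEG = {T: NIL, U: U, NIL: T}


def literal_value(literal, valuation):
    # A literal is written "x1" or, negated, "~x1".
    if literal.startswith("~"):
        return NEG[valuation[literal[1:]]]
    return valuation[literal]


def clause_value(clause, valuation):
    # A clause is a disjunction: the dominating value wins.
    return max((literal_value(l, valuation) for l in clause), key=RANK.get)


def residue(cnf, valuation):
    """Return (value, residue) for a conjunction of clauses.

    The residue is the true residue, the residue, or the false residue,
    depending on whether the value is T, ? or NIL.
    """
    value = min((clause_value(c, valuation) for c in cnf), key=RANK.get)
    parts = []
    for clause in cnf:
        if clause_value(clause, valuation) == value:
            parts.append([l for l in clause
                          if literal_value(l, valuation) == value])
    return value, parts


P = [["x1", "x2"], ["~x1", "x3"]]
print(residue(P, {"x1": T, "x2": T, "x3": T}))
print(residue(P, {"x1": NIL, "x2": U, "x3": U}))
print(residue(P, {"x1": T, "x2": T, "x3": NIL}))
```

Under these assumptions the all-T valuation yields the true residue
(x1 v x2)x3, and the valuation (NIL, ?, ?) yields the residue x2.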

The INSTANTIATOR will complete an IR(x r y) [Instantiate Relation] command
only if no contradiction arises in S_{X,r}(x,y), and among all the conditions
associated with the DETL(x,r). There is also an IRN(x r y) command, which
will set ~(x r y) true in the CDB, if possible. Corresponding to IR and IRN we
also have DR and DRN (D for Delete) and JR and JRN (J for Justify). Whereas
an IR command will not accept (x r y) if a contradiction arose in DETL(x,r),
JR (and similarly JRN) will attempt to modify the relations in DETL(x,r)
appropriately (if possible) and thus attempt to justify the given (x r y).

The sense definitions act as the gate keepers of the CDB, making sure that
nothing illegal happens. The CDB is thus always kept contradiction free.
However, as discussed in Srinivasan 1973b, because of the three-valued logical
system, there might exist hidden contradictions (contradictions arising
because of incomplete knowledge) in the CDB. An assertion in a domain is true
if and only if models can be built in the CDB to satisfy the assertion.

This feature of the CDB is used by TMPR (as discussed in Srinivasan 1973b) to
find proofs of assertions in a domain. The TMPR provides the control structure
to the CHIN system, to direct it appropriately to build models to satisfy an
assertion, if such models are possible. If models do not exist then it will
discover a contradiction. In the model building process the TMPR uses the
residues generated by the CHIN system to guide itself. The theorem proving
process in TMPR develops proofs by synthesis. It introduces a new approach to
theorem proving.

The DESIGNER is used to do means-end analysis, to invoke the appropriate
transformation rules, like (TR1), to reach a goal, and to interpret the
transformations. The DESIGNER may use the CHIN and TMPR systems to find (or
build) the appropriate objects in the CDB to accomplish a given task.

To do intelligent problem solving in a domain both the TMPR and DESIGNER
should be able to appropriately summarize a problem state, and their own past
experiences, and use these effectively to search the solution space (the
goal-subgoal tree) for a given problem. For a given domain, the description
schemas for describing the states of the DESIGNER and TMPR may themselves be
again specified by templates and sense definitions. Let DS (Designer State)
and TPS (Theorem Prover State) be the templates with associated sense
definitions, that specify the respective states of the problem solvers for a
domain D. Every time the DESIGNER invokes a function (note that the DESIGNER
can invoke the TMPR itself as a function) it will create a new instance of DS
to describe the problem state associated with the invoked function. The DS
itself might be as follows:

((DS $N)

(fn-called FND): Some unique way of identifying the called function.

(initl-state ICOND, SENSEI): Some way of specifying the initial state in the
CDB, and possibly other problem conditions that caused the invocation of the
function.

(bindings BNDGS, SENSEB): The list of all possible bindings available in the
CDB for the arguments of the invoked function (actually its closure), for the
current invocation. In general, there might be more than one possible
binding. Let us assume that BNDGS also flags the currently chosen bindings.
SENSEB might specify how to choose the current binding.

(subgoals SUBGLS, SENSEG): The list of DS-instances corresponding to all
available subgoals, if such subgoals exist. SENSEG might specify how to
choose one from among the list.

(sensesummaries SSM, SENSES): Summaries (usually made out of the residues) of
all sense definitions evaluated during the active tenure (i.e. before the
DS-instance is closed up as having been successful or a failure) of the
DS-instance. All these summaries will appear in terms of the variables
appearing in the invoked function. The template SSM might itself be domain
dependent. The sense definitions associated with the SSM template might
provide ways of analyzing and summarizing all the residues obtained in a
DS-instance. One may include in these sense definitions domain specific
evaluation functions, if any.

The sense summaries will generally fall into two classes: those associated
with the successful completion of the invoked function (SUCSSM) and those
associated with the failure or suspension of the function (FAILSSM). Notice
that for each DS-instance, its associated sense summaries would specify the
special cases in which the invoked function either succeeded or failed. When
new invocations of the same (or similar, in a suitably defined sense, again
possibly specific to a given domain) functions occur, these special cases
might first be checked to avoid repeating errors, and to choose, if possible,
the correct course. Conditions for examining such sense summaries might appear
in SENSEB and SENSEG given above. To facilitate such checking, DS may also
have,

(history DSL, SENSEH): List of all other DS-instances of the same (or
similar) function. The reader should note that the instantiations of the
relations in a DS-instance will themselves cause the associated sense
definitions to be invoked, and their evaluations will produce residues which
might themselves cause an entirely new sub-problem solving activity to take
place.

(final-state FSTATE, SENSEF): The changes performed in the CDB during the
tenure of the DS-instance.

(fn-state FNS, SENSE?): The state in which the DS-instance was finally
closed: success, failure, or suspended.

(successor DS, SENSES)): The successor DS-instance, if any.
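
As a rough picture of what a DS-instance carries, the relations listed above
can be mirrored by a Python record. The class name DSInstance, the field
names, the default value of fn_state, and the close method are assumptions
for illustration; in the MDS these would be relations of a template with
their own sense definitions.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class DSInstance:
    """Hypothetical record mirroring the relations of the DS template."""
    fn_called: str                                        # fn-called
    initl_state: Any = None                               # initl-state (ICOND)
    bindings: List[Any] = field(default_factory=list)     # bindings (BNDGS)
    subgoals: List["DSInstance"] = field(default_factory=list)
    sense_summaries: List[Any] = field(default_factory=list)
    history: List["DSInstance"] = field(default_factory=list)
    final_state: Any = None                               # final-state (FSTATE)
    fn_state: str = "suspended"                           # assumed default
    successor: Optional["DSInstance"] = None

    def close(self, fn_state, final_state=None):
        """Close the instance as success, failure, or suspended."""
        if fn_state not in ("success", "failure", "suspended"):
            raise ValueError(fn_state)
        self.fn_state = fn_state
        self.final_state = final_state


root = DSInstance("GOAL(JOHN emotion HAPPY)")
attempt = DSInstance("TR1", initl_state="JOHN not RICH")
root.subgoals.append(attempt)
attempt.sense_summaries.append("IFDON makes JOHN a THIEF")
attempt.close("failure")
print(root.subgoals[0].fn_state)
```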

The  relations  given  above  are  typical  of  what  one  might  want,  in  order  to 
meaningfully  describe  the  state  of  the  DESIGNER.  The  important  concept  to  notice 
is  that  different  such  DS  templates  might  be  defined  for  different  domains. 

The  network  of  instantiations  of  the  DS  template,  generated  during  the  course 
of  a  problem  solving  process,  would  constitute  the  problem  solving  protocol. 

This protocol will not only contain a trace of changes done on the items in
the CDB, but it also will document the reasons why certain courses of action
were taken, and certain others abandoned. For each DS-instance, the program
schema,

[(ICOND) ∧ (SUCSSM) → FSTATE] ∧
[(ICOND) ∧ (FAILSSM) → ICOND]

in effect summarizes the effect of the DS-instance: for the given initial
state and conditions summarized in the SUCSSM, the DS-instance leads to the
indicated FSTATE (final-state), and for the same ICOND and FAILSSM the ICOND
is left unchanged. This program schema may now be used to translate a protocol
to a program. One would, of course, choose only the DS-instances appearing in
the successful execution path of the protocol. Each such program will
correspond to a special case of the invoked function.
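
The translation from a protocol to a program can be pictured with the
following Python sketch. It assumes a protocol is simply a list of
dictionaries, one per DS-instance, with hypothetical keys fn_called, icond,
sucssm, fstate and fn_state; only the instances closed as successful are kept,
each as a step of the form (function, condition, resulting state).

```python
def protocol_to_program(protocol):
    """Keep the successful DS-instances as (function, condition, effect) steps."""
    program = []
    for ds in protocol:
        if ds["fn_state"] == "success":
            condition = (ds["icond"], ds["sucssm"])   # ICOND and SUCSSM
            program.append((ds["fn_called"], condition, ds["fstate"]))
    return program


# Hypothetical protocol for the running example.
protocol = [
    {"fn_called": "TR1", "icond": "properties unknown", "sucssm": None,
     "fstate": None, "fn_state": "failure"},
    {"fn_called": "ASKFOR age", "icond": "age unknown", "sucssm": "user replied",
     "fstate": "age known", "fn_state": "success"},
    {"fn_called": "CREATE marriage", "icond": "unmarried",
     "sucssm": "compatible partner created", "fstate": "marriage COMPATIBLE",
     "fn_state": "success"},
]

program = protocol_to_program(protocol)
for step in program:
    print(step)
```

The failed attempt with TR1 stays in the protocol as a record of what to
avoid, but it does not appear in the generated program.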

The template for the theorem proving state may also be similarly used to guide
the theorem prover intelligently in a domain dependent way. It is worth
mentioning here certain important features:

In MDS the problem solvers, in fact, generate a description of what they do
as they solve a problem. In fact the problem solving process is itself simply
the process of describing what the MDS is doing. These descriptions are,
however, generated in a highly domain dependent way. Again the notion of
specialization comes in. Most importantly, summaries of the problem solving
protocols may be made in the form of canned programs, with characteristic
conditions for their invocation. These canned programs may later be called
whenever appropriate. In this sense the MDS can constantly learn and improve
itself. Also, clearly, it is being used to do automatic programming in a
non-trivial way.

The operations of the TMPR and DESIGNER are discussed in a fair amount of
detail in Srinivasan 1973b and 1973a. Let us now get back to our example.

3.  Making  JOHN  Happy 

The DESIGNER receives the goal (GOAL(JOHN emotion HAPPY)). First it checks
the CDB to see if JOHN is already HAPPY. Then it searches its repertoire of
transformation rules to see whether there exists a transformation to make a
PERSON happy, since JOHN is a PERSON. It does not find any. So it simply
issues JR(JOHN emotion HAPPY) (JR for Justify Relation) to the INSTANTIATOR.
The INSTANTIATOR, of course, calls the CHECKER, which now evaluates

S_{PERSON,emotion}(JOHN, HAPPY),

which in our case is SENSE5. Since none of JOHN's properties are known, the
condition evaluates to ?, and the CHECKER returns to the INSTANTIATOR the
following residue, (R1):

(R1). (PERSON JOHN)
      ((JOHN attributes RICH) v (JOHN marriage.quality COMPATIBLE))
      ~(JOHN attributes THIEF) ~(JOHN attributes POOR)
      ~(JOHN marriage.quality LOUSY).

The INSTANTIATOR now passes this on to the TMPR, since it has the JR command.

It is now the TMPR's job to create new objects satisfying R1. Let us suppose
it first sets up the goal

(GOAL(JOHN attributes RICH)).

When the DESIGNER gets this, it finds the transformation rule (TR1). Now, if
the bank robbery and the following ASSERT statement are both successfully
completed in TR1, then JOHN will be RICH. However, now the IFDON statement
should also be executed. This makes JOHN a THIEF and hence not HAPPY (this
violates R1). This approach should therefore be abandoned. During this
process all the sense summaries and problem conditions would have been
recorded in the various DS-instances created by the problem solvers.
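
The step just described, trying one way of satisfying the disjunction in R1
and abandoning it when another part of R1 is violated, can be sketched in
Python as a simple search over alternatives. The names achieve, rob_bank and
marry_compatibly, and the dictionary model of JOHN, are hypothetical; each
alternative is tried on a copy of the state so that an abandoned attempt
leaves no trace.

```python
import copy


def achieve(state, alternatives, constraints):
    """Try each alternative on a copy of the state.

    Return the first (name, new_state) whose result satisfies every
    constraint; otherwise return (None, state) with the state unchanged.
    """
    for name, plan in alternatives:
        trial = plan(copy.deepcopy(state))
        if trial is not None and all(ok(trial) for ok in constraints):
            return name, trial
    return None, state


def rob_bank(s):
    # Stands in for TR1: success makes the person RICH and also a THIEF.
    s["worth"] = 1_000_000
    s["attributes"] |= {"RICH", "THIEF"}
    return s


def marry_compatibly(s):
    # Stands in for creating a compatible MARRIAGE.
    s["marriage_quality"] = "COMPATIBLE"
    return s


def not_thief(s):
    return "THIEF" not in s["attributes"]


john = {"attributes": set(), "marriage_quality": None, "worth": None}
chosen, result = achieve(
    john,
    [("rob a bank", rob_bank), ("marry", marry_compatibly)],
    [not_thief],
)
print(chosen, result)
```

The bank robbery is tried first and rejected because it violates the
condition on THIEF; the original model of JOHN is unaffected by that attempt.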


Now, the next possibility is to get JOHN married, and make the marriage
compatible. So create a MARRIAGE for JOHN. To complete this marriage, another
partner (a woman) has to be found according to SENSE6. The woman has to love
JOHN. Also, since the marriage quality should be compatible, the woman should
not be more than 5 years older than JOHN and also should share a hobby with
JOHN (from SENSE7). What is the age of JOHN? What are his hobbies? There are
no sense definitions associated with a PERSON's age and hobby. So, ask the
user.

Now, if permitted, the TMPR can create a new FEMALE PERSON with the
appropriate properties (to love John, be not more than 5 years older than
John, and share a hobby with John), and marry John off in order to make him
happy. If the MDS is advised not to create women like that (this condition can
be imposed by advising the MDS that only the available resources in the CDB
may be used to solve the problem), then the system will now merely put John as
being happy, and associate with his happiness the residue (R1) as a condition.
Later, when more properties of John become available, the system will check
whether the conditions on John's happiness are being satisfied. (But, of
course, bachelor John could well land in a LOUSY marriage later on and lose
his happiness!)

If the job had been successfully completed (by creating a woman), then
essentially, the program generated from the protocol would say:

Making a PERSON happy:

(PERSON properties unknown)

(ASKFOR PERSON's age)

(ASKFOR PERSON's hobby)

(CREATE PERSON's marriage, and make it compatible
 by creating an appropriate partner for PERSON).

in some suitable programming language (which could, of course, be the command
language of the INSTANTIATOR, together with some facility to invoke the
commands conditionally). In the execution of this program one may, if desired,
entirely suppress the CHECKER, since for the conditions satisfying the program
it is known to succeed without creating any contradictions in the CDB. One
has to, of course, include within the program the steps for creating a woman
such that the marriage is compatible.

Briefly, this illustrates the essential concepts in MDS, the organization of
the MDS and its operational features. The significant innovations are the
following:

(i) The concept of the description-schema, and a system organization where
every aspect of the system's functioning adapts itself to the description
schemas. One may think of these schemas as representation strategies for
domain dependent knowledge and problem solving techniques. For a complex
domain the creation of these schemas will itself be a formidable problem. The
MDS can help intelligently in this task.

(ii) The Coherent Data Base operates on a three-valued logical system. As
shown in Srinivasan 1973b, it is this three-valued logic feature that makes
constructive proofs in a domain possible.

(iii) The problem solving process is itself viewed as a process of describing
what the MDS is doing. From the description of the way a problem is solved
(the description has more information than a program trace) the MDS can
generate a program for solving the problem.

There are several new concepts in the MDS organization. Logical conditions
(the sense definitions) are used in MDS as programs as well as data.
(Generally, in theorem proving systems logical conditions are used only as
data.) The structural organization of descriptions itself builds into the
system a lot of logical constraints. The description structure of an object is
used to classify objects in a domain into objects of different kinds, like
PERSON, MARRIAGE, PEOPLE, etc., in our example. This classification of objects
in a domain into objects of different kinds later aids the system in
summarizing its own problem solving experiences; it is thus capable of
generalizing what happens to JOHN as what might in general happen to a PERSON
with certain properties.

The realization of the MDS as a working system will significantly advance
the art of AI as well as the art of automatic programming. In the next section
we shall attempt to place the MDS work in perspective, within the spectrum of
automatic programming systems, as viewed from the point of view of a specific
classification schema.


4. MDS and automatic programming


is saying in a discourse with the system. It should, for example, be able to
resolve ambiguities of specification to the extent possible; and it should be
able to identify the missing pieces of information in a given context of
discourse, and seek to obtain them. Also, whenever necessary, it should call
on available canned programs, or generate by itself the necessary programs,
to solve the problems it encounters during the course of its interaction with
the user. Ideally, it should be able to improve itself by experience. The MDS
can satisfy all these requirements.

To do all this in a given domain, the AP system should not only have some
expectations on the nature of knowledge--facts, conjectures and procedures--in
the domain, but it should also be capable of automatically invoking the
appropriate pieces of information within a given context, and use them
correctly.

The kinds of problem solving facilities necessary to create such a system
exist in systems like STRIPS, and the programming facilities necessary to
create such systems are available in CONNIVER and PLANNER like systems. STRIPS
uses a general theorem prover for problem solving and is thus restricted in
its scope of applications. There is no notion of domain specific
specialization in STRIPS. The CONNIVER and PLANNER like systems enable one to
create highly specialized domain specific problem solvers. But the programmer
has the responsibility to build all the problem solving systems. In MDS, we
find a general problem solving facility that can be specialized to specific
domains. We have, truly, the concept of programming over a knowledge base.

For our purposes here I shall classify the existing works in AP-systems as
shown below. I should hasten to point out that the classification given here
is not intended to be complete; on the contrary it is meant to reflect one
man's biased opinions. A survey of the AP-systems appears in Balzer [1972].

AP-EFFORTS:

[1A] Systems with general problem solving.

[1A1] Those dominated by the problem solver.
      These have no capacity for domain specific specialization. Leading
      example is STRIPS [Fikes 1972]. Some of the other examples appear
      in Darlington [1973], Manna and Waldinger [1971], Luckham and
      Buchanan [1973].

[1A2] MDS-Type: The general problem solver is driven by the representation
      strategies chosen for a domain. There is a strong sense of
      domain specific specialization.

[1B] Programming Systems.
      Leading examples are PLANNER [Hewitt 1972], CONNIVER [McDermott
      1973]. Useful to create domain specific intelligent systems. But
      do not have problem solving features to provide intelligent guidance
      for automatic programming.

[1C] Systems that include a lot of domain specific knowledge.

[1C1] Made of a collection of canned programs which expertly encapsulate
      domain specific knowledge.

[1C1A] With an intelligent interface to cleverly select the appropriate
       functions in response to problem conditions.
       EX: NONE.

[1C1B] With no such intelligent interface. EX: MACSYMA.

[1C2] Made of special purpose synthesis routines that produce highly
      optimized code for given classes of problems in restricted domains.
      EX: Wilkens [1973, included in the Appendix].
          Guard, J. [1972].

[1C3] A set of programming conventions--as for example in structured
      programming--together with a well organized, automated clerical
      system to document programs and guide users in debugging.
      EX: BLISS [Wulf 1973], Parnas [1971], Wirth [1973].
      In our project the work of Welsch [1973 a b] falls in this
      category.

[1D] Proving properties of programs.

[1D1] General Approach.
      EX: Igarashi, S., London, Luckham [1973], Stanford AI-Memo.

[1E] Programs to improve programs.

[1E1] General Approach.
      Darlington & Burstall [1973].

[1E2] Restricted Approach.
      Marvin Paull's work in our project falls in this category.
      Paull [1973].

In this schema we are placing the MDS in a class by itself. In our
discussions of the MDS here we have ignored the problem of design of a language
of discourse for communicating with the MDS in a domain. We have, in fact,
identified a way of defining a language to the MDS in terms of a mapping from
the linguistic units in a language (lexical items and phrases) to items in
the description-schema of a domain. The language understanding process is
viewed as a process of translation from utterances in a language to models in
the CDB. This model building process may use the full problem solving power
of the MDS to effect the translation process. We shall report our findings in
this area in subsequent reports.

We believe that the MDS can bring the full powers of a general problem
solving system to the services of common computer users in different domains
of discourse, each communicating with the machine in a language appropriate
to the domain.


Acknowledgement.


The  discussions  I  had  with  Balzer  were  very  useful  in  clarifying 
many  of  the  concepts  presented  here.  It  is  a  pleasure  to  acknowledge  this 
help. 




REFERENCES:

Balzer, R. M. (1972) "Automatic Programming," ISI Institute Memo, ITEM 1.

Balzer, R. M., et al. (1973) "Domain-Independent Automatic Programming,"
ISI-RR-73-14, USC/ISI, 4676 Admiralty Way, Marina Del Rey, Calif. 90291, USA.

Dahl, Dijkstra, Hoare (1972) Structured Programming, Academic Press.

Darlington, J. and Burstall, R. M. (1973) "A System Which Automatically
Improves Programs," Proc. 3rd IJCAI Conf., pp. 479-485.

Fikes, R. E. and Nilsson, N. J. (1972) "STRIPS: A New Approach to the
Application of Theorem Proving to Problem Solving," J. Art. Intel. 3(1),
pp. 27-68, April.

Guard, Jim, et al. (1972) "BASIS/APG Users Guide," An Automatic Program
Generation System for Business Information Processing, Applied Logic
Corporation, Princeton, N.J. 08540.

Hewitt, Carl (1972) "Description and Theoretical Analysis ... of PLANNER:
...." Ph.D. dissertation, M.I.T. AI-TR-258.

Hoare, C. A. R. (1969) "An Axiomatic Basis for Computer Programming,"
Comm. ACM 12, pp. 576-580, 583.

Igarashi, S., London, Luckham (1973) "Automatic Program Verification I: A
Logical Basis and its Implementation," Stanford Art. Intel. Memo
A.I.M.-200, May.

Luckham, D. C. and Buchanan, J. R. (1973) "Automatic Generation of Simple
Programs; a Logical Basis and Implementation," AI Project Report,
Stanford University.

Manna, Z. and Waldinger, R. J. (1971) "Toward Automatic Program Synthesis,"
Comm. ACM 14, pp. 151-165.

McDermott, Drew V. and Sussman, G. J. (1973) Son of Conniver, The Conniver
reference manual, Version II.

Parnas, D. L. (1971) "A Technique for Software Module Specification with
Examples," Carnegie Mellon University, March.

Pauli, Marvin (1973) "Procedures for Formulating and Improving Algorithms,"
Dept. of Comp. Sc. Tech. Report (December), Rutgers University, New
Brunswick, N.J. 08903.

Srinivasan, C. V. (1973a) "The Architecture of Coherent Information System:
A General Problem Solving System," Proc. of 3rd IJCAI Conference,
pp. 218-228.

Srinivasan, C. V. (1973b) "A New Approach to Theorem Proving: Proof by
Synthesis," Dept. of Comp. Sc. Tech. Report, RVCBM-DS-TR26, November.

Welsch, L. (1973) "Correctness of Lock and Unlock Primitives in Hydra,"
Dept. of Comp. Sc. Technical Memo, Rutgers University, New Brunswick, N.J.,
November.

Wilkens, E. (1973) "Realisation of Sequential Machines Using Random Access
of Memory: Part I," Dept. of Comp. Sc. Report, Rutgers University,
New Brunswick, N.J. 08903.

Wirth, N. (1973) Systematic Programming: An Introduction, Prentice Hall.

Wulf, Cohen, Corwin, et al. (1973) "HYDRA: The Kernel of a Multiprocessor
Operating System," Carnegie-Mellon Computer Science Dept., June.


SOSAP-TM-10
December  1976 


THE  BLIND  HAND  PROBLEM 
T.  Hsu 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  partially  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6 
to  the  Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research Projects Agency or the U. S. Government.


THE BLIND HAND PROBLEM*

By

Tau Hsu**
December 1976


ABSTRACT: Three kinds of representations: the higher order logic in
Darlington's system, the ABSTRIPS in Sacerdoti's system and Meta
Description System in Srinivasan's system are investigated using the
blind hand problem. The advantages and disadvantages of each
representation are discussed.


I. INTRODUCTION


In this short note, three kinds of representations: the higher
order logic in Darlington's system, the ABSTRIPS in Sacerdoti's system
and Meta Description System (MDS) in Srinivasan's system are
investigated using the blind hand problem. Some difficulties of each
representation are discussed.


The  blind  hand  problem  can  be  stated  as  follows: 


There are two places termed "here" and "there" and a mechanical
hand with three actions, namely "pickup", which causes a randomly
selected object at the place of the hand to be held; "letgo", which
results in the hand being empty; and "go", which moves the hand
to a place. In the initial state s0 there are red things and only red
things "here", and the goal is the condition that at least one red
thing is "there". The location of the hand in state s0 is unknown.
The problem is to design a sequence of actions to achieve a state in
which the goal condition holds.


*This work was supported by a grant from the Advanced Research Agency
(GRANT NO.: DAHC15-73-G6), of the Government of the United States of
America.

**Dept. of Computer Science, Rutgers University, New Brunswick, New Jersey
08903


II. THE  HIGHER  ORDER  LOGIC 


The Blind Hand example in Darlington's paper (1971) has been
solved by using a second order logic theorem prover. Darlington has
stated several advantages of this representation: first, it is a very
neat representation. Especially the frame problem can be easily
solved by adding some extra rules as 'frame axioms'. Secondly, it has
a shorter proof than the first order logic and most strategies used in
the first order logic, e.g. set of support, restrictive strategy, can
also be used in the higher order logic. In addition to using the
unification algorithm of the first order logic, a more powerful
unification algorithm called 'f-matching mode' is invoked whenever the
system fails to find the most general unifier by applying the first
order unification algorithm. Thirdly, the plan can be automatically
built by the mechanical theorem prover when the given goal is proved.


The main task of solving this problem is to automatically
maintain consistency of the information in the system, i.e.
to have the system update the information automatically. For example,
in the blind hand problem, when the mechanical hand goes to a new place,
then whatever was held by the mechanical hand will also change to the
new place. In the second order logic we can write this as:

f(thingsat(x1,go(x1,s1))) = f(thingsat(x1,s1) U thingsheld(s1))

This says that the things at place x1 after the mechanical hand 'go'es
to x1 in state s1 will be the union of the things originally at x1 and
the things held by the hand in state s1. Moreover, all the properties of
the things at x1 after action 'go' are the same as those of the things
under the union operation. (f is the function variable ranging over
properties of sets.)

In the above rule, lots of things have been said: first, this rule
includes the set concept, e.g. the expression thingsat(x1,s1) will
return a set which contains all the things at x1 in state s1; it is
very convenient to have this kind of operator; ABSTRIPS is weak in
this feature. Secondly, it describes what is changed when action 'go'
is invoked. Finally, it says what is unchanged.

A special feature of this problem is the 'nondeterministic
feature'. The action 'PICKUP' deals with random choice. It causes
some uncertainty of a plan which makes the problem difficult.

The final plan that Darlington generated is:

go(there, pickup(go(here, letgo(s0))))

This solution is not guaranteed to work if initially the hand is
'here' and holds a thing which is not red. The correct plan should be

go(there, pickup(go(here, letgo(go(there, s0)))))

This defect arises because there is no 'AT' predicate in
Darlington's system, hence 'go(here,s)' is needed to express
implicitly where the location of the hand is. Otherwise this special
situation may conflict with the rule on 'go', and the initial
condition which says that there are only red things 'here'.
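
The difference between the two plans can be checked by brute force. The
following Python sketch is not Darlington's theorem prover; it is a small
simulation under assumptions made only for illustration: one red object r1
"here", one non-red object b1, 'letgo' leaves the held thing at the hand's
place, and 'pickup' with a full hand does nothing. All names (successors,
guaranteed, and so on) are hypothetical.

```python
# Assumed toy world: r1 is red and "here"; b1 is not red.
COLOUR = {"r1": "red", "b1": "blue"}

def successors(state, action):
    """Return every state that `action` may lead to (pickup is random)."""
    hand, held, locs = state
    locs = dict(locs)                      # never mutate the caller's state
    if action[0] == "go":
        hand = action[1]
        if held is not None:
            locs[held] = hand              # held things travel with the hand
        return [(hand, held, locs)]
    if action[0] == "letgo":
        if held is not None:
            locs[held] = hand              # assumption: dropped where the hand is
        return [(hand, None, locs)]
    if action[0] == "pickup":
        if held is not None:               # assumption: full hand picks nothing
            return [(hand, held, locs)]
        here = [o for o, p in locs.items() if p == hand]
        if not here:
            return [(hand, None, locs)]
        return [(hand, o, locs) for o in here]   # one branch per possible choice
    raise ValueError(action)

def goal(state):
    """At least one red thing is 'there'."""
    return any(COLOUR[o] == "red" and p == "there" for o, p in state[2].items())

def initial_states():
    """Hand position unknown; the hand may already hold the non-red thing."""
    states = []
    for hand in ("here", "there"):
        states.append((hand, None, {"r1": "here", "b1": "there"}))
        # A held thing is not recorded as being 'here' in the initial state.
        b1_place = "there" if hand == "there" else None
        states.append((hand, "b1", {"r1": "here", "b1": b1_place}))
    return states

def guaranteed(plan):
    """True if the plan reaches the goal from every start, on every branch."""
    for s0 in initial_states():
        frontier = [s0]
        for action in plan:
            frontier = [t for s in frontier for t in successors(s, action)]
        if not all(goal(s) for s in frontier):
            return False
    return True

DARLINGTON = [("letgo",), ("go", "here"), ("pickup",), ("go", "there")]
CORRECTED = [("go", "there")] + DARLINGTON
```

Under these assumptions guaranteed(DARLINGTON) is False, because the hand
may drop the non-red thing "here" and then pick it up again, while
guaranteed(CORRECTED) is True.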


III. ABOUT ABSTRIPS

(1). Introduction

In ABSTRIPS, an action is represented in the form of three lists,
i.e. a precondition list, an addition list and a deletion list. Each
list contains a set of well-formed formulas (wff) and each wff in the
precondition list has been assigned some criticality value by the
system, according to the importance and the difficulty of the wff
compared with some user predefined criticality value of some atomic
wffs. Using these lists, it avoids using the situation variable,
since it assumes that the truth values of only the assertions in the
add list and delete list will change. When the system attempts to
build a plan, it will try to accomplish the wff which has the highest
criticality value first. The difficulty of satisfying a goal
increases with the criticality value. This biases the system towards
rejecting unfeasible plans, resulting in a smaller planning space and
hopefully a more efficient system. Also, to maintain efficiency the
system tries to avoid using negated predicates. An example of this
occurs in the use of two predicates 'STATUS(x,CLOSE)' and
'STATUS(x,OPEN)' instead of 'STATUS(x,CLOSE)' and
'NOT(STATUS(x,CLOSE))'. To enable the system to assign criticality
values properly to the wff, the system needs an extra axiom in the
world model:

(ALL x)STATUS(x, CLOSE) <=> NOT (STATUS(x, OPEN))

(2). The modified version of the blind hand problem

The original blind hand problem will be discussed in the section
3. The modified blind hand problem will be described in terms of the
representation of ABSTRIPS first. The only difference between these
two versions is in the action 'pickup'. In the modified version,
instead of random pick-up, we could specify the thing we want to pick
up.


We shall first define the following predicates:


AT(x,y):    True, if x is at place y; false, otherwise.
HELD(x):    True, if x is held by the hand; false, otherwise.
NOTHELD:    True, if the hand does not hold anything;
            false, otherwise.
RED(x):     True, if x is red; false, otherwise.
TYPE(x,y):  True, if x has the type y; false, otherwise.


For efficiency, we defined both HELD(x) and NOTHELD. The extra
axiom: (ALL x)NOT(HELD(x)) <=> NOTHELD, is also defined for the sake
of completeness. The operators are defined as follows:


GO(x): The hand goes to place x.

Preconditions: AT(hand,$1), TYPE($1,PLACE), TYPE(x,PLACE)
add list: AT(hand,x)
delete list: AT(hand,$1)


MOVE(obj,x): Move OBJECT obj to PLACE x.

Preconditions: AT(hand,$1), AT(obj,$1), HELD(obj), TYPE($1,PLACE),
TYPE(x,PLACE), TYPE(obj,OBJECT)
add list: AT(hand,x), *AT(obj,x)
delete list: AT(hand,$1), AT(obj,$1)

(where * denotes the predicate which is the main purpose for
applying this operator.)


PICKUP(x): The hand picks up the object x.

Preconditions: NOTHELD, AT(hand,$1), AT(x,$1), TYPE(x,OBJECT),
TYPE($1,PLACE)
add list: HELD(x)
delete list: NOTHELD


LETGO: Release anything held by the hand
Preconditions: HELD($1)
add list: NOTHELD
delete list: HELD($1)


For the initial conditions, we want to say that there are only
red things at 'here'. In the second order logic, we could write:


NOT[INTERSECTION(anyof(thingsat(here,s0)),redthings)]=0


In ABSTRIPS, since it does not include the set concept and set
operators, we may write something like:

(EXIST x)[TYPE(x,OBJECT) AND AT(x,here)] AND
[(ALL x)((TYPE(x,OBJECT) AND AT(x,here)) -> RED(x))]

There is still one problem. Since the right part of the
conjunction is only true for the initial conditions, we have to
introduce a time parameter. In Darlington's system, this time
information is embedded in the state variable. For simplicity, let's
just define the initial world model with all the instances which
satisfied the initial condition as follows:


TYPE(here,PLACE)
TYPE(there,PLACE)
TYPE(obj1,OBJ)
TYPE(obj2,OBJ)


AT(obj1,here)
AT(HAND,there)
RED(obj1)
HELD(obj2)


Besides the lack of the set concept in ABSTRIPS, it is usually
also necessary to avoid disjunction and negation in wffs. The use of
negation in a wff would cause the system to check the entire data base
every time, causing a considerable waste of time. Using the
disjunction in a wff will potentially cause backtracking, and it is not
clear yet how to assign the criticality value to each predicate and
how to proceed with the control flows in the disjunctive form.


The criticality value is assigned, first, to the predicates which
cannot be changed by any operators. In our case, the predicates with
the highest criticality would be TYPE and RED. Let them both have the
criticality value 6. Then, since accomplishing the predicate HELD
requires the truth value, true, for the predicate AT in its
precondition list, HELD is assigned a higher criticality value than
AT. Let us assign the value 4 for HELD and 2 for AT. Because we have
the axiom:


(ALL x)NOT(HELD(x)) <=> NOTHELD


NOTHELD and HELD will have the same criticality value, i.e. 4.
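
As a rough illustration of how these values order the work, the short
Python sketch below groups the precondition list of MOVE(obj1,there) into
criticality levels and lists them from highest to lowest. The table
CRITICALITY holds the values chosen above; the function by_level and the
tuple encoding of wffs are hypothetical and are not part of ABSTRIPS.

```python
# Criticality values chosen in the text.
CRITICALITY = {"TYPE": 6, "RED": 6, "HELD": 4, "NOTHELD": 4, "AT": 2}

# Precondition list of MOVE(obj1, there); "$1" is the unbound place variable.
MOVE_PRE = [
    ("AT", "hand", "$1"), ("AT", "obj1", "$1"), ("HELD", "obj1"),
    ("TYPE", "$1", "PLACE"), ("TYPE", "there", "PLACE"),
    ("TYPE", "obj1", "OBJECT"),
]

def by_level(wffs):
    """Group wffs by criticality, highest level first."""
    levels = {}
    for wff in wffs:
        levels.setdefault(CRITICALITY[wff[0]], []).append(wff)
    return sorted(levels.items(), reverse=True)

for level, wffs in by_level(MOVE_PRE):
    print(level, wffs)
```

The planner would work in the space of criticality 6 first (the TYPE
predicates), then 4 (HELD), and only then 2 (AT).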


The goal of this problem is:

(EXIST x)[TYPE(x,OBJECT) AND AT(x,there) AND RED(x)]


First, we instantiate a Skolem dummy variable, say v, to delete the
existential quantifier, so we have


TYPE(v,OBJECT) AND AT(v,there) AND RED(v)


as the new goal. Now we are going to accomplish the new goal by
binding v to some value. We will first try to accomplish the
predicates which have the highest criticality value, i.e. TYPE and
RED. Hence by knowing obj1 is red, we bind v to obj1 immediately.
Then, we try to accomplish AT(obj1,there) which is not true in our
initial world model. The only operator to accomplish this goal is
MOVE. We need to invoke MOVE(obj1,there) which has the precondition
list:

AT(hand,$1), AT(obj1,$1), HELD(obj1), TYPE($1,PLACE),
TYPE(there,PLACE), TYPE(obj1,OBJECT)

The predicates with the highest criticality value, 6, are
TYPE($1,PLACE), TYPE(there,PLACE) and TYPE(obj1,OBJECT). If we bind
$1 to 'here', then they are all satisfied in the space of criticality
value 6. Next, we try to accomplish the predicate with the
criticality value 4, i.e. HELD(obj1). After searching all the
operators, the only possible operator, PICKUP(obj1), is invoked. The
precondition of PICKUP is

NOTHELD, AT(hand,$1), AT(obj1,$1), TYPE(obj1,OBJECT)

where TYPE(obj1,OBJECT) is satisfied already. So we try to accomplish
the predicate NOTHELD, which in turn will require operator LETGO. By
binding $1 to obj2 in the precondition of LETGO, we have completed the
space of criticality equal to 4. For the space of criticality equal
to 2, the following predicates which are left over from the previously
applied operators have to be satisfied:

From operator PICKUP, we have: AT(hand,$1), AT(obj1,$1)
From operator MOVE, we have: AT(hand,here), AT(obj1,here)

The easiest way to satisfy all of these is binding $1 to "here", and
invoking the operator GO(here), which has the precondition:

AT(hand,$1), TYPE($1,PLACE), TYPE(here,PLACE)

This can be satisfied by substituting "there" for $1. Then all the
remaining predicates can be satisfied automatically without invoking
any more operators. Therefore, we have built the following plan:

LETGO, GO(here), PICKUP(obj1), MOVE(obj1, there)

where the order of LETGO and GO(here) is not important.
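
The plan can be replayed mechanically against the initial world model. The
Python sketch below is only an illustration of the precondition / add /
delete bookkeeping, not Sacerdoti's planner: it performs no search and no
criticality-driven abstraction, the bindings of $1 are passed in by hand,
and the names op, apply_op and run are hypothetical. The type name OBJECT
is used throughout.

```python
def op(pre, add, delete):
    """An operator instance: three sets of ground atoms."""
    return {"pre": frozenset(pre), "add": frozenset(add), "del": frozenset(delete)}

def GO(x, p):                       # p is the binding chosen for $1
    return op([("AT", "hand", p), ("TYPE", p, "PLACE"), ("TYPE", x, "PLACE")],
              [("AT", "hand", x)], [("AT", "hand", p)])

def MOVE(obj, x, p):
    return op([("AT", "hand", p), ("AT", obj, p), ("HELD", obj),
               ("TYPE", p, "PLACE"), ("TYPE", x, "PLACE"), ("TYPE", obj, "OBJECT")],
              [("AT", "hand", x), ("AT", obj, x)],
              [("AT", "hand", p), ("AT", obj, p)])

def PICKUP(x, p):
    return op([("NOTHELD",), ("AT", "hand", p), ("AT", x, p),
               ("TYPE", x, "OBJECT"), ("TYPE", p, "PLACE")],
              [("HELD", x)], [("NOTHELD",)])

def LETGO(held):
    return op([("HELD", held)], [("NOTHELD",)], [("HELD", held)])

def apply_op(state, o):
    """Check the precondition list, then apply the delete and add lists."""
    missing = o["pre"] - state
    if missing:
        raise ValueError(f"unsatisfied preconditions: {sorted(missing)}")
    return (state - o["del"]) | o["add"]

INITIAL = frozenset({
    ("TYPE", "here", "PLACE"), ("TYPE", "there", "PLACE"),
    ("TYPE", "obj1", "OBJECT"), ("TYPE", "obj2", "OBJECT"),
    ("AT", "obj1", "here"), ("AT", "hand", "there"),
    ("RED", "obj1"), ("HELD", "obj2"),
})
GOAL = frozenset({("TYPE", "obj1", "OBJECT"), ("AT", "obj1", "there"), ("RED", "obj1")})

def run(plan, state=INITIAL):
    for o in plan:
        state = apply_op(state, o)
    return state

PLAN = [LETGO("obj2"), GO("here", "there"),
        PICKUP("obj1", "here"), MOVE("obj1", "there", "here")]
SWAPPED = [PLAN[1], PLAN[0]] + PLAN[2:]   # GO(here) before LETGO
```

Both run(PLAN) and run(SWAPPED) contain every atom of GOAL, which matches
the remark that the order of LETGO and GO(here) does not matter in this
model. Note that the model never records where obj2 ends up.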

In the above process for building plans in ABSTRIPS, we could see
some interesting points:

1). Being guided by the criticality values of predicates, the system
bound x to obj1 immediately and correctly in the first try. It will
never be misled by the fact that obj2 is already at place "there", even
though it satisfies a part of the goal.

2). The frame problem occurs in the operator GO. When the hand goes
to x, we do not know whether the hand is holding something or not. If
the hand is holding something, then this thing must also change to the
new place. Hence an extra rule which interacts with the action 'GO'
is needed in the world model:

(ALL s)(ALL x)(ALL y)(EXIST z)[HELD(x,s) AND AT(x,y,s) ->
AT(x,z,RESULT(GO(z,s))) AND NOT(AT(x,y,RESULT(GO(z,s))))]

where a situation variable s is attached to all the predicates and the
function 'RESULT' is used to map an action to a situation [reference
McCarthy and Hayes]. Using the situation variable increases the
complexity of the control process. It seems that a better approach is
to add a wff to the add list of the action GO as follows:

HELD(y) -> AT(y,x) AND NOT(AT(y,$1))

Then build a special control for it.

We also need an extra rule to tell the system that if something is
being held by the hand, then this thing is at the same place as the
hand is, i.e.,

(ALL x)(ALL y)[(HELD(x) AND AT(hand,y)) -> AT(x,y)]

Hence, in the world model there are usually quite a few axioms hanging
around. Each time when an action is done or some predicates are being
updated, we need to check through all of these axioms. A heavy price
is paid for this in loss of efficiency.
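
The conditional entry proposed for the add list of GO can be pictured as
follows. This Python sketch is one possible reading of "a special control"
and is an assumption, not a feature of ABSTRIPS: the function
go_with_held (a hypothetical name) applies the usual add and delete lists
of GO and, for every y with HELD(y) in the model, also adds AT(y,x) and
deletes AT(y,$1).

```python
def go_with_held(state, x):
    """GO(x) extended with the conditional effect
    HELD(y) -> AT(y,x) AND NOT(AT(y,$1))."""
    # Bind $1 to the hand's current place (exactly one such atom is assumed).
    (p,) = [a[2] for a in state if a[:2] == ("AT", "hand")]
    add = {("AT", "hand", x)}
    delete = {("AT", "hand", p)}
    for atom in state:
        if atom[0] == "HELD":
            y = atom[1]
            add.add(("AT", y, x))        # the held thing arrives at x
            delete.add(("AT", y, p))     # and is no longer at $1
    # Deletions are applied before additions, so GO to the current place is safe.
    return (frozenset(state) - delete) | add

before = frozenset({
    ("AT", "hand", "there"), ("HELD", "obj2"),
    ("AT", "obj2", "there"), ("AT", "obj1", "here"),
})
after = go_with_held(before, "here")
```

With this effect attached to GO, no situation variable is needed; the
cost is that the operator is no longer a plain triple of fixed lists.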

3). One good point in delaying the substitution of the unknown
variable is to avoid backtracking. In the above example, while
invoking the operator PICKUP, we did not bind $1 in the space of
criticality 4 since we did not have any information for binding at
that point. We wait until the space of criticality 2, where we have a
good reason to bind $1 to "here".

(3). The  original  blind  hand  problem 

Now, we consider the original blind hand problem in which the
hand randomly picks up an object. Thus, it does not need an argument
for PICKUP. The new PICKUP will look like:

PICKUP

Pred: NOTHELD, (EXIST x)(AT(x,$1) & TYPE(x,OBJECT)), AT(hand,$1)
add:  HELD(x) where x is one of the objects at place $1.
del:  NOTHELD


An ad-hoc way to solve this problem is to use the same plan we had
in the modified version, repeatedly removing one object from "here" to
"there" until no objects are at place "here". Since initially there is
a red object at place "here", there is a red object at place
"there" in the final condition.

In order to solve it in a more formal way, we need to add the set
concept and set operator in ABSTRIPS and to tell the system some
heuristics for guiding the control flow.


First, we want the system to know that the only way to guarantee
that the hand will always pick up a red thing at some place is by
making sure that there are only red things at that place. We could
write this as an extra axiom with a situation variable, s, and
function, RESULT, as mentioned in the last section.

(ALL s)[((ALL p)(AT(hand,p,s) & (ALL x)[AT(x,p,s) -> RED(x,s)])) ->
(ALL x)[HELD(x,RESULT(PICKUP(s))) -> RED(x,RESULT(PICKUP(s)))]]

The next thing we want the system to know is that one way to
guarantee that there are always only red things "here" is never to let
the hand hold anything when it goes to "here", since initially there
are only red things "here". Thus we have another rule:

(ALL s)[NOTHELD(s) ->
(ALL x)[AT(x,here,RESULT(GO(here,s))) -> RED(x,RESULT(GO(here,s)))]]

The control structure of these two rules is quite complicated.
An easier way is to attach these kinds of strategy rules to the
related actions, e.g. PICKUP and GO, so that the situation variable
can be omitted. This cannot be done in the current ABSTRIPS
representation scheme. Another difficulty is how to define the
properties of the argument of HELD, which deals with the random choice
from a set, in the add list of action PICKUP.

The solution of the original blind hand problem can be generated
in a way similar to the procedure in the last section. Because of the
two extra strategy rules, the hand must be empty before it goes to
"here". Since we have a rule:

(ALL x)(ALL y)[(HELD(x) & AT(hand,y)) -> AT(x,y)]

This implies initially either the hand is not at place 'here' or the
hand is holding a red object in order to satisfy the initial
condition. Therefore the hand does not have to go somewhere else
to empty it. Thus, the plan we have will look like:

LETGO, GO(here), PICKUP, MOVE(there)


The ABSTRIPS example in Sacerdoti's paper that describes how the
system can build a plan for a robot to push a box from one room to
another points out a few other weaknesses in the system.

1). For one action 'GO' in the robot problem, it has three different
types of 'GO', namely: go to object bx 'GOTOB(bx)', go to door dx
'GOTOD(dx)', and go to coordinate location (x,y) 'GOTOL(x,y)'.
Moreover, it has an action in order to go through the door dx into
room rx called 'GOTHRUDR(dx,rx)'. All these 'GO'es change the location
of an object, but each may cause some different side effects: some
will cause the object to be in a new room, some will cause the object
to be next to another object, and some will result in a change of
the location of an object. For a specific outcome of interest, the
choice of an action is determined by the explicit recognition of the
kind of side effects that are desired. Similarly for the action push,
they have push box bx to box by 'PUSHB(bx,by)', push bx to door dx
'PUSHD(bx,dx)', push bx to coordinate location (x,y) 'PUSHL(bx,x,y)',
push bx through door dx into room rx 'PUSHTHRUDR(bx,dx,rx)'. This
kind of representation appears to be too heavily specialized to the
particular, stylized example. It is not a scheme that one might adopt
in general, for representation of actions.

2). It needs some rules to enable the system to completely define the
criticality value. Sometimes such rules do not make too much sense
for us: for example, it has


(ALL x)[PUSHABLE(x) -> TYPE(x,OBJECT)]


which seems quite hard to be defined completely by a user.

3). It has no inverse relation available, hence when we delete
NEXT(X,Y), we have also to delete NEXT(Y,X). It is very inconvenient
and the system needs the extra storage to record both predicates
NEXT(x,y) and NEXT(y,x).

4). For each action, before execution, it needs to do lots of type
checking since usually the type predicate has the highest criticality
value. It spends quite a bit of time on that.

IV. MDS

(1). The Domain Definition of the Blind Hand Problem

In MDS, many of the above problems are completely avoided. (In
this section, we assume that the reader has a basic knowledge of MDS.)

The domain definition of the blind hand problem will look as
follows:


(TDN: OBJECT (isat (PLACE RE) locationof CC1)
             (color (COLOR RE) colorof)
             (heldby (HAND RE) holding))

Notice that the relation flag 'RE', Regular Mode, tells the system an
object can only be at one place, only have one color and only be held
by one hand. Here 'locationof' and 'heldby' are defined as the names
of inverses of relations 'isat' and 'holding', respectively.


CC1-OBJECT-isat:

(CSCC: (QUOTE ((PLACE X) !
         (((ALL HAND H) NOT (H holding @) (@ isat X))
          ((SOME HAND H)(H holding @)(X locationof H)))))
       OBJECT isat)

CC1 here is a consistency condition that specifies the condition which
an object will have to satisfy to be at a certain location. CC1 says
that if the object is not held by any hand, then the location of that
object could be any location asserted by the user or the system. But,
if it is held by a hand, then the location of that object is the same
as the location of the hand.
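
The reading of CC1 given above can be paraphrased in ordinary code. The
Python sketch below is not MDS and does not use MDS syntax; it only mirrors
the two cases in prose. The function cc1 and the dictionary layout are
hypothetical, and Python's None is used here to stand for the MDS truth
value '?'.

```python
def cc1(obj, place, hands):
    """Check the assertion (obj isat place) against CC1.

    hands maps a hand name to {"holding": object or None,
                               "isat": place or None}.
    Returns True (accept), False (reject) or None (unknown, '?').
    """
    for info in hands.values():
        if info["holding"] == obj:
            if info["isat"] is None:
                return None                # holder's location is unknown
            return info["isat"] == place   # must coincide with the hand
    return True                            # free object: any place is accepted

hands = {"h": {"holding": "b", "isat": "here"}}
print(cc1("b", "here", hands), cc1("b", "there", hands), cc1("c", "there", hands))
```

Because the flag RE allows an object to be held by at most one hand, the
loop can return at the first holder it finds.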


(TDN: HAND (isat (PLACE RE) locationof)
           (holding (OBJECT RE) heldby CC2 TR1))


CC2-HAND-holding:

(CSCC: (QUOTE ((OBJECT X) !
         (@ holding X)(X isat:locationof @)))
       HAND holding)


TR1-HAND-holding:

(CSTR: (QUOTE (((T ?) (DCOND
         (((SOME OBJECT X)(X is OLDVAL))
          (IR (OLDVAL isat (@ isat))))
         (((SOME OBJECT X)(X is NEWVAL))
          (IR (NEWVAL isat: @flag 1)))))))
       HAND holding)

CC2 says that if a hand is holding an object, then the object and the
hand must be at the same place. The anchored transformation rule is
invoked only for ASSERT or IR (Instantiate Relation) commands. If
(ASSERT (h holding b)) is successful (i.e. b was assigned as the
value of (h holding)), then the CC evaluation would result in the
truth value T or ?. In this case, b would be the NEWVAL of TR1, and
TR1 will execute (IR (NEWVAL isat: @flag 1)), i.e. set the @flag of
relation 'isat' of the object b to '1'. When the @flag is 1 the value
of the relation is not stored in the model space, but is computed
every time it is needed. If one asserted (NOT (h holding b)) when h
was initially holding b, then b would be the OLDVAL of TR1, and in this
case the NEWVAL will be ?. Also, the CC2 would have returned the
truth value ?. Thus, TR1 will execute (IR (b isat (h isat))) which
will reset the flag of (b isat) back to 0, and assign the location of
the hand h as the new location of b.
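
The effect of TR1 on the flag of 'isat' resembles a computed attribute.
The Python sketch below is only an analogy and assumes nothing about how
MDS is implemented: while an object is held, its location is computed from
the hand (flag 1); on release, the hand's current location is stored as the
object's own (flag 0). The classes Hand and Thing and both functions are
hypothetical names.

```python
class Hand:
    def __init__(self, isat):
        self.isat = isat
        self.holding = None

class Thing:
    def __init__(self, isat):
        self._stored = isat      # used while the flag is 0
        self.heldby = None       # inverse of Hand.holding

    @property
    def isat(self):
        # Flag 1: the value is computed from the holder, not stored.
        return self.heldby.isat if self.heldby is not None else self._stored

def assert_holding(hand, thing):
    """Like (ASSERT (h holding b)): CC2 is checked, then the flag is set."""
    if hand.isat != thing.isat:
        raise ValueError("CC2 violated: hand and object are at different places")
    hand.holding, thing.heldby = thing, hand

def assert_not_holding(hand, thing):
    """Like (ASSERT (NOT (h holding b))): store the hand's place, reset the flag."""
    thing._stored = hand.isat
    hand.holding, thing.heldby = None, None

h, b = Hand("here"), Thing("here")
assert_holding(h, b)
h.isat = "there"                 # the hand goes; b's location follows
where_while_held = b.isat        # "there"
assert_not_holding(h, b)
h.isat = "here"                  # b stays where it was released
```

No frame axiom about 'go' is needed in this arrangement, since the
location of a held object is never stored separately.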


The inverse relations are defined automatically by the system. Thus
we get:

(TDN: (COLOR RM) (colorof (OBJECTS CL) color))

(TDN: (OBJECTS CL) (ELEMDN OBJECT))

(TDN: (PLACE RM) (locationof (OBJORHAND CL) isat))

(TDN: (OBJORHAND CL) (ELEMDN OBJECT HAND))

Notice that many objects may have the same color, so the system
automatically creates a new template, called 'OBJECTS', which is the
collection of 'OBJECT's. For a similar reason, 'OBJORHAND' is also
created. The names for these new templates are declared by the user.

The CC's and TR's in the above domain are used by MDS to
establish and maintain a consistent model space for the domain. In
the case of the BLIND HAND domain, such a model space will contain
specific instances of HANDs, PLACEs, OBJECTs and COLORs, and relations
that relate these instances as per constraints specified by the
template and consistency conditions. For a discussion of the way MDS
uses the above CC's and TR to maintain consistency in the model space
see Srinivasan [February 1976]. We shall discuss below only the
aspects of MDS operations relevant to our example.

(2).  Some  comments  on  conventions  in  MDS 

Let h be an instance of HAND in the model space, p a PLACE, b an
OBJECT and c a COLOR. Let us consider the constraint CC1-OBJECT-isat.
This constraint has the following form:

((PLACE X) | P(@ X)),

where P(@ X) is a predicate expression with two free variables: @ and
X. X is called the set-variable, and @ is called the anchor of the
CC. For an OBJECT, b, if one asserts (b isat p), then MDS would bind
the anchor variable @ to b -- the anchor variable is always bound to
the current instance at which an assertion is being made -- and the
set variable X to p, and evaluate the predicate P(@ X) in CC1. If the
predicate is satisfied then the assertion will be accepted.

Predicates like P(@ X) are evaluated in the MDS model space in
3-valued logic: True, Unknown (?) and NIL. For example, if the
set-variable X in P(@ X) is unknown, then the truth value of '(@ isat
X)' appearing in CC1 will be hypothesized to be unknown in the
evaluation of CC1. In this case, if there is a HAND, h, such that h
is holding b, then during the evaluation process of CC1, X will get
bound to the location of h, as a result of the expression: '((SOME
HAND H) (H holding @) (X locationof H))'. In this case the evaluation
of CC1 will return the location of h as its value. Also, in this
case, if one asserted that (b isat q) for a PLACE q that is different
from p, then the predicate in CC1, namely '((ALL HAND H) (NOT (H holding
@)) (@ isat X)) OR ((SOME HAND H) (H holding @) (X locationof H))',
will evaluate to NIL. If there is no HAND holding b, then CC1 will
accept any assertion of the form (b isat q) for any q. The reader may
similarly examine the interpretation of CC2-HAND-holding. In general,
if CC[X r] is the consistency condition associated with the relation r
of a template X, then in MDS CC[X r] is evaluated as a function of two
arguments: CC[X r](@ Y), where @ is the anchor variable and Y is the
set variable. The evaluation of CC[X r](@ Y) will return the bindings
for Y together with the truth value of the predicate in CC[X r]. This
truth value may, of course, be T (True), ? (Unknown) or NIL (False). In
general, CC[X r] has the following interpretation:

(@ r Y) <-> CC[X r](@ Y).
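The three-valued evaluation described above can be illustrated with a minimal Python sketch. It is not MDS code: the function name, the dictionaries and the use of None for an unknown value are all assumptions made for illustration, and only the behaviour of CC1 as stated in the text is modelled.

```python
# Hypothetical sketch of a three-valued check for CC1-OBJECT-isat.
# The names below are illustrative; they are not part of MDS.
TRUE, UNKNOWN, NIL = "T", "?", "NIL"


def cc1_object_isat(obj, place, holding, hand_location):
    """Evaluate CC1 for the assertion (obj isat place).

    holding:       dict hand -> object held by that hand
    hand_location: dict hand -> place, or None when unknown
    place:         asserted place, or None when the set variable is unknown
    Returns (truth value, binding for the set variable X).
    """
    holders = [h for h, o in holding.items() if o == obj]
    if not holders:
        # No hand holds the object: any known place is acceptable.
        return (TRUE, place) if place is not None else (UNKNOWN, None)
    loc = hand_location.get(holders[0])
    if loc is None:
        # The holder's location is unknown, so the predicate is undecided.
        return (UNKNOWN, place)
    if place is None:
        # X unknown: bind it to the location of the holding hand.
        return (TRUE, loc)
    return (TRUE, place) if place == loc else (NIL, loc)


holding = {"h": "b"}
hand_location = {"h": "p"}
print(cc1_object_isat("b", None, holding, hand_location))
print(cc1_object_isat("b", "q", holding, hand_location))
```

The returned pair mirrors the statement that a CC evaluation yields the bindings for the set variable together with a truth value.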


The MDS model space will not accept assertions that produce
contradictions in the CC's defined for a domain.

The transformation rules like the rule TR1-HAND-holding in the
BLIND HAND domain are used in MDS to perform the side-effects that may
be caused as a result of accepting an assertion into the model space.
The specific side effects may of course depend on the truth value
produced by the evaluation of the CC's associated with a
transformation rule. In the BLIND HAND domain TR1-HAND-holding will
be invoked by MDS after CC2-HAND-holding is evaluated. Depending upon
the truth value returned by CC2-HAND-holding the actions prescribed in
the rule are executed, as discussed before.

After completing the domain definition, MDS will
automatically build the DON-LISTs and DET-LISTs. Let CC[X r] be the
CC at (X r). Then the DON-LIST of (X r) is the list of all anchors (Y
m), such that (Y m) occurs in CC[X r]. In other words, the evaluation
of the CC[X r] will call for the value of (y m) for some or all
instances y of Y, in the model space. For every (Y m) that occurs in
the DON-LIST of (X r), the anchor (X r) itself will occur in the
DET-LIST of (Y m). This has the following interpretation:

Let y be any instance of Y, and let [x] be the set of all
instances of X in a model space. Then, every time the value of (y m)
is changed, in order to maintain the consistency of the model space,
it may be necessary to check the CC's at every (x r), for every x in
[x]. Of course, for a particular y, only a subset of [x] may depend
on the value of (y m). To identify this subset, we shall associate
with the DET-LIST entry (X r), a constraint of the form ((X x) | P(@
x)). Constraints of this kind are called FILTERS in MDS. If a filter
is available at (Y m) then, when (y m) is asserted for a particular
instance y of Y, the CC's at the anchors (x r) will be checked only
for the objects x in ((X x) | P(@ x)).

In the BLIND HAND domain, there are only two CC's,
CC1-OBJECT-isat and CC2-HAND-holding. So only (OBJECT isat) and (HAND
holding) have DON-LISTs. For example, the DON-LIST of (OBJECT
isat) is '((OBJECT isat) (HAND holding) (PLACE locationof))'. The
DON-LIST of (HAND holding) is '((OBJECT isat) (HAND holding) (PLACE
locationof))'. (PLACE locationof) occurs in both the DON-LISTs of
(OBJECT isat) and (HAND holding). Thus, the DET-LIST of (PLACE
locationof) will have both the anchors (HAND holding) and (OBJECT
isat), with associated filters. The DET-LISTs and DON-LISTs generated
by MDS for the BLIND HAND domain are shown below:


[A1]:
    The DON-LIST of (OBJECT isat) is:
        ((HAND holding) (OBJECT isat) (PLACE locationof))
    The DET-LIST of (OBJECT isat) is:
        At DET-anchor (HAND holding):
            ((HAND Y) | (@ isat:locationof Y))

[A2]:
    The DON-LIST of (HAND holding) is:
        ((HAND holding) (OBJECT isat) (PLACE locationof))
    The DET-LIST of (HAND holding) is:
        At DET-anchor (OBJECT isat):
            ((OBJECT Y) | (@ holding Y))

[A3]:
    No DON-LIST for (PLACE locationof).
    The DET-LIST of (PLACE locationof) is:
        1). At DET-anchor (OBJECT isat):
            ((OBJECT Y) | ((@ locationof:holding Y)
                           OR (@ locationof Y)))
        2). At DET-anchor (HAND holding):
            ((HAND Y) | (@ locationof Y))

[A4]:
    No DON-LIST and DET-LIST for (COLOR colorof)

[A5]:
    No DON-LIST and DET-LIST for (OBJECT color)

[A6]:
    No DON-LIST for (OBJECT heldby)
    The DET-LIST of (OBJECT heldby) is:
        At DET-anchor (HAND holding):
            ((HAND Y) | (@ heldby Y))

[A7]:
    No DON-LIST and DET-LIST for (HAND isat)
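The relationship between the two kinds of lists can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the tuple encoding of anchors are hypothetical, self-references are dropped because the listings [A1] and [A2] do not show them, and the extra DET-LIST entries that arise from inverse relations (as in [A6]) and the filters themselves are not modelled.

```python
from collections import defaultdict


def build_det_lists(don_lists):
    """Invert DON-LISTs: if (Y m) is in the DON-LIST of (X r),
    then (X r) is placed in the DET-LIST of (Y m)."""
    det_lists = defaultdict(set)
    for anchor, depends_on in don_lists.items():
        for dep in depends_on:
            if dep != anchor:  # assumption: self-dependencies are not listed
                det_lists[dep].add(anchor)
    return dict(det_lists)


# DON-LISTs of the BLIND HAND domain, as given in [A1] and [A2].
don_lists = {
    ("OBJECT", "isat"): [("HAND", "holding"), ("OBJECT", "isat"),
                         ("PLACE", "locationof")],
    ("HAND", "holding"): [("HAND", "holding"), ("OBJECT", "isat"),
                          ("PLACE", "locationof")],
}
det_lists = build_det_lists(don_lists)
for anchor, dependents in det_lists.items():
    print(anchor, "->", sorted(dependents))
```

Under these assumptions the sketch reproduces the anchors listed in [A1], [A2] and [A3]: (PLACE locationof) ends up with both (OBJECT isat) and (HAND holding) in its DET-LIST.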


It is instructive to examine the above DON-LISTs and DET-LISTs
with reference to what happens when a hand, h, holding an object b,
moves from one place to another: i.e. when one makes an assertion (h
isat q) for a place, q, when initially (h isat p) was true. The
following operations will result in MDS:

The system will focus attention on the relations (h isat), (q
locationof) and (p locationof). It will delete h from the value of (p
locationof) and insert h in (q locationof), while at the same time
substituting q for p in (h isat). This in effect is the new desired
configuration. (h isat) has no DET-LIST. However, (PLACE locationof)
has two DET-LIST entries: one is (OBJECT isat) with the filter shown
in [A3] above; the other is (HAND holding), also with an associated
filter. These DET-LISTs will be activated when (p locationof) and (q
locationof) are changed.
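The paired update of a relation and its inverse described above (delete h from (p locationof), insert h in (q locationof), substitute q for p in (h isat)) can be sketched as follows. The class and method names are hypothetical, and the sketch assumes a single-valued forward relation with a set-valued inverse; it does not attempt to model CC checking or DET-LIST activation.

```python
from collections import defaultdict


class RelationStore:
    """Toy store that keeps a relation and its inverse in step."""

    def __init__(self, inverse_names):
        self.inverse_names = dict(inverse_names)  # e.g. {"isat": "locationof"}
        self.forward = defaultdict(dict)          # relation -> {x: y}
        self.inverse = defaultdict(lambda: defaultdict(set))  # inverse -> {y: {x}}

    def assert_relation(self, x, rel, y):
        inv = self.inverse_names[rel]
        old = self.forward[rel].get(x)
        if old is not None:
            self.inverse[inv][old].discard(x)  # delete x from the old value
        self.forward[rel][x] = y               # substitute the new value
        self.inverse[inv][y].add(x)            # insert x in the new value

    def values(self, y, inv):
        return set(self.inverse[inv][y])


store = RelationStore({"isat": "locationof"})
store.assert_relation("h", "isat", "p")
store.assert_relation("h", "isat", "q")
print(store.values("p", "locationof"), store.values("q", "locationof"))
```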

The DET-LIST entry, (OBJECT isat) with its associated filter,
demands that when (p locationof) is changed, all the objects
satisfying the condition:

((OBJECT Y) | (p locationof:holding Y) OR (p locationof Y))

should now be examined, at (Y isat).

Thus, normally, when a hand, h, moves the location of the object
held by the hand will get examined and updated if necessary. However,
in our model, for the object, say b, held by h, (b isat:@flag) is 1.
Thus, the location of b will always be computed using CC1-OBJECT-isat,
every time it is needed. Therefore, there is really no need to examine
the location of an object held by h when h is moved. Recognizing this
fact, one may now associate with the DET-LIST anchor (OBJECT isat) at
(PLACE locationof) an additional NULL filter, saying:

((OBJECT X) | NIL).

In this case, every time (PLACE locationof) is updated for any place
p, no DET-LIST interactions will take place with (x isat) for any
object x. This is done in MDS by the set filter (@SFILTER) command below:

@SFILTER: [((OBJECT X) | NIL)
           (PLACE locationof) (OBJECT isat)]


Thus, as the hand, h, is moved the system will update the location of
h, without having to examine interactions with any of the locations of
objects in the model space. The frame interactions are identified in
MDS via the DET-LIST mechanisms. This enables MDS to identify
inconsistencies, if any, in an updating process. The filter mechanism
provides a way of controlling the combinatorial explosion that may
result in general, in frame interactions of this kind. We have in the
BLIND HAND domain an extreme case of the use of filter, where the
filter is set to NIL. In general, one may associate a variety of
filters to selectively control the frame interactions.
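How a filter narrows the set of instances to be rechecked, and how a NULL filter suppresses the recheck altogether, can be sketched in Python. Everything here is an assumption for illustration: the function name, the representation of a filter as a Python callable, and the use of None for the NULL filter are not taken from MDS.

```python
def instances_to_recheck(det_entries, changed_anchor, changed_instance, instances):
    """Return (instance, relation) pairs whose CC's must be rechecked.

    det_entries: dict anchor -> list of (template, relation, filter)
    filter:      callable(changed_instance, x) -> bool,
                 or None for the NULL filter ((X x) | NIL)
    instances:   dict template -> list of instance names
    """
    to_check = []
    for template, relation, flt in det_entries.get(changed_anchor, []):
        if flt is None:
            continue  # NULL filter: no instance is selected
        for x in instances.get(template, []):
            if flt(changed_instance, x):
                to_check.append((x, relation))
    return to_check


located_at = {"p": {"obj1", "hand1"}, "q": {"obj2"}}
instances = {"OBJECT": ["obj1", "obj2"], "HAND": ["hand1"]}


def at_place(place, x):
    # Hypothetical filter: select only the things located at the changed place.
    return x in located_at.get(place, set())


anchor = ("PLACE", "locationof")
with_filter = {anchor: [("OBJECT", "isat", at_place)]}
null_filter = {anchor: [("OBJECT", "isat", None)]}
print(instances_to_recheck(with_filter, anchor, "p", instances))
print(instances_to_recheck(null_filter, anchor, "p", instances))
```

With the ordinary filter only the objects at the changed place are examined; with the NULL filter nothing is examined, which corresponds to the extreme case used in the BLIND HAND domain.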

(3). The Solution of the Blind Hand Problem

In the statement and solution of the BLIND HAND problem, we will
see below how MDS uses the above domain definition. In MDS, to solve
this problem it is not even necessary to define separate actions, like
'GO', 'MOVE', 'PICKUP', etc. The following single transformation rule
is enough:

(@TRDN: MOVEOBJ(X Y Z)
    ([(OBJECT X) (PLACE Y Z) (X isat Y) (GOAL (X isat Z))]
     [(SOME HAND H) (ASSERT (H holding X))
                    (ASSERT (H isat Z))
                    (ASSERT (NOT (H holding X)))]))

Here the first line defines the name and the arguments of this
transformation rule, MOVEOBJ(X Y Z); the second line is called the
'dimension' of the transformation rule, which states how the arguments
are bound and what the goal is; the rest of the rule is the body,
which states how the goal is accomplished. This rule will be invoked
whenever there is a need to change the location of an object. The
algorithm for changing is simple: get some hand to hold the object,
change the location of the hand, and let the hand stop holding the
object. All the necessary frame interactions that are needed to
maintain the consistency of the model space while executing those
actions are automatically inferred from the domain definition.
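The three-step algorithm (hold, move the hand, release) can be sketched as a toy Python model. This is not the MDS implementation: the class, its method names and the use of None for an unknown location are assumptions, and the consistency checks are reduced to a single comparison of places. The one behaviour taken from the text is that the location of a held object is computed from its hand rather than stored.

```python
class ModelSpace:
    """Toy model space: hand locations are stored; the location of a
    held object is computed from its hand instead of being stored."""

    def __init__(self, object_at, hand_at):
        self.object_at = dict(object_at)  # object -> place (stored value)
        self.hand_at = dict(hand_at)      # hand -> place, or None if unknown
        self.holding = {}                 # hand -> object

    def isat(self, obj):
        for hand, held in self.holding.items():
            if held == obj:
                return self.hand_at[hand]  # computed, as when the flag is 1
        return self.object_at[obj]

    def assert_holding(self, hand, obj):
        if self.hand_at[hand] is None:
            # Hypothesize that the hand is where the object is.
            self.hand_at[hand] = self.object_at[obj]
        if self.hand_at[hand] != self.object_at[obj]:
            raise ValueError("hand and object are at different places")
        self.holding[hand] = obj

    def assert_hand_isat(self, hand, place):
        self.hand_at[hand] = place

    def assert_not_holding(self, hand, obj):
        if self.holding.get(hand) == obj:
            # Store the hand's location as the object's new location.
            self.object_at[obj] = self.hand_at[hand]
            del self.holding[hand]


def moveobj(space, obj, dest, hand):
    """Hold the object, move the hand, then release the object."""
    space.assert_holding(hand, obj)
    space.assert_hand_isat(hand, dest)
    space.assert_not_holding(hand, obj)


space = ModelSpace({"obj1": "here"}, {"hand1": None})
moveobj(space, "obj1", "there", "hand1")
print(space.isat("obj1"), space.hand_at["hand1"])
```

The caller never updates the object's location explicitly; it follows from the hand's movement, which is the point the rule above makes about frame interactions being inferred from the domain definition.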

The statement of the problem would simply be:

((SOME OBJECT O) (O color red) (GOAL (O isat there)))

In response to this input, the DESIGNER will first enquire the
MDS data base for the current value of each instance of OBJECT. If
the current condition satisfies the goal, then it would return 'SUCCESS'.
Let us assume that initially the model space contains the following
objects and relations:

OBJECTs: obj1, obj2, ..., obj10

PLACEs: here, there

HAND: hand1

COLOR: red

RELATIONS: (here locationof (obj1, obj2, ..., obj6))

           (there locationof (obj7, obj8, ..., obj10))

           ((obj1, obj2, ..., obj6) color red)

Let the initial value of (hand1 isat) be ?.

From the input goal statement, the system first finds the set of
objects such that the color of each element is red. Then it checks
the location of each object in the set. If any one in that set is at
place 'there', as mentioned before 'SUCCESS' is returned. Otherwise
an 'invocation pattern' is generated. The 'invocation pattern' is
used by MDS to invoke transformation rules that might be appropriate
to reach the goal. In our case, the model space does not satisfy the
goal. As a result of the initial examination of the model space the
system would have identified the following relevant bindings and
conditions:

(OBJECT O): O <- (ONEOF (obj1, obj2, ..., obj6))

(PLACE P): P <- here

Initial Condition: (here locationof (obj1, obj2, ..., obj6))

Goal Condition: (PLACE Q): Q <- there; (Q locationof O)

The generated invocation pattern would be:

((OBJECT O) (PLACE P Q) (P locationof O) (GOAL (Q locationof O)))

Using this invocation pattern MDS would invoke the transformation
rule, MOVEOBJ defined above. The bindings shown above will be used in
the execution of the transformation rule, to bind the local variables
of the rule. (ONEOF (obj1, obj2, ..., obj6)) will cause one of the
indicated objects to be bound to X. Let X <- obj1 be the initial
choice; Y <- here, and Z <- there. The predicate '(SOME HAND H)' in
the body of the rule will cause a hand from the model space to be
selected. Notice that there could be more than one hand in the model
space. Then an arbitrary choice will be made. In our case, of
course, hand1 will be chosen, resulting in H <- hand1. Having done
the bindings, the actions: '(ASSERT (hand1 holding obj1)), (ASSERT
(hand1 isat there)), (ASSERT (NOT (hand1 holding obj1)))' will
be initiated in sequence. The assertion of '(hand1 holding obj1)'
will cause CC2-HAND-holding to be evaluated. Since the location of
hand1 is unknown, CC2 will evaluate to ?, and the residue '(obj1
isat:locationof hand1)' will be returned. This will cause the system
to make the hypothesis '(here locationof hand1)' and make the
assertion. If initially, hand1 was 'there' then, of course, CC2 would
have evaluated to NIL with the false residue '(obj1 isat:locationof
hand1)'. In this case, if more hands are available in the system,
then the system will choose another hand. However, while choosing
another hand, h, it will make sure that the false residue '(obj1
isat:locationof h)' is not again violated. Thus, the system would
have already learnt from its first mistake and avoid the mistake in
subsequent trials. If no other hand is available then the above false
residue may be used to set up a new subgoal, namely (GOAL (here
locationof hand1)). In our case, (hand1 holding obj1) will succeed,
causing TR1-HAND-holding to set (obj1 isat:@flag 1), as discussed
before.
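The way a false residue steers the choice of another hand can be sketched as follows. This is a deliberately simplified illustration, not the MDS mechanism: the function name is hypothetical, a residue is reduced to "this hand was found to be at the wrong place", and None stands for an unknown location.

```python
def choose_hand(hands, obj_place, hand_place, false_residues):
    """Pick a hand for an object located at obj_place.

    hand_place:     dict hand -> place, or None when unknown
    false_residues: set of hands already found to violate the residue
    Returns the chosen hand, or None when a subgoal would be needed.
    """
    for hand in hands:
        if hand in false_residues:
            continue  # learnt from an earlier failure; do not retry
        place = hand_place.get(hand)
        if place is None or place == obj_place:
            return hand  # CC2 would evaluate to ? or T: binding is usable
        false_residues.add(hand)  # CC2 would evaluate to NIL: remember it
    return None


failed = set()
hand_place = {"hand1": "there", "hand2": None}
print(choose_hand(["hand1", "hand2"], "here", hand_place, failed))
print(failed)
```

When the function returns None, the text above suggests that the false residue would instead be turned into a new subgoal, which this sketch leaves out.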

It should also be noted that in the domain definition, '(HAND
holding OBJECT)' indicates that a HAND can hold only one OBJECT. If
hand1 was already holding an object b, then the assertion (hand1
holding obj1) will cause the system to remove b from being held by
hand1, and introduce obj1 as the NEWVAL of (hand1 holding).

The remaining assertions in MOVEOBJ would now follow and complete
the realization of our goal. The important point to note here is
that, at the time the problem is stated or at the time the
transformation rule is defined, it is not necessary for a user to be
aware of domain constraints and frame interactions.


(4). Discussion

Besides the special facilities for controlling the combinatorial
explosion of the frame problem mentioned above, MDS has other
distinguishing features, as follows:

1). The set concept is automatically built into the MDS formalism.
There are two layers in MDS for building a knowledge base: the first
one is the domain definition layer which defines the syntax and
constraints of a domain; the second layer is the instantiation layer
which builds the model space by instantiating instances and relations.
During the instantiation, the system will automatically check the
consistency of the new instance or relation according to the
definition of the first layer, then accept it or reject it. Hence,
each instance is closely related with its defined type, called
"template", in the first layer. In the 3-valued logical system of the
MDS model space, both the positive and negative values of elementary
relations may be stored. Hence, MDS can efficiently evaluate
negations of predicates.

2). The inverse relation updating process is built into the system.
In general, if the relation (X r Y) is defined, then the inverse
relation (Y rof X) is automatically defined by the system.

3). Similar to the idea of the criticality value in ABSTRIPS, in MDS
one can define 'focus lists'. The focus list contains all the
important predicates which must be satisfied first while making a new
assertion. It helps the system find the correct order to process the
control when many predicates must be satisfied at the same time.


4). There is no "state" variable or "time" parameter in MDS or
ABSTRIPS. Hence it is very difficult to describe a rule which depends
on time; like our initial condition, it is hard to say that
"initially, all the objects at 'here' are red".

5). Because of the implementation of DON-LIST and DET-LIST, although
MDS is written in first order logic, it really has some
features of second order logic. When each particular predicate is
instantiated or updated, the system will go through the related
DON-LIST and DET-LIST checking the consistency. Notice that this
procedure, which checks all the related predicates in the model space,
is in fact the same work we described in the frame rule in second
order logic. In addition the "residue" concept in MDS helps maintain
the system efficiently. The residue is that subexpression of the CC
which supplies the reason why the predicate evaluated to a particular
truth value. Hence each time a predicate is updated, the value of
each residue which contains that predicate is examined. If it keeps
the same value, then it implies that the truth value of the predicate
to which the residue belongs is also unchanged. On the other hand, if
the truth value of a residue is changed, then re-evaluation of the
related CC is necessary. Another distinguishing feature of residues is
the learning capability in a problem solving context. If a binding
had generated 'true' for the truth value of the true-residue, then
next time the same binding will be used. But if a binding had
generated 'false' for the truth value of the false-residue, then the
same binding will not be used again. In this way, the system learns
how to bind things correctly and avoids the same wrong binding again
according to the previous evaluation of a residue.

6). Another good feature is having the model space of MDS work on a
three valued logic. Hence we can say that the truth value of a
predicate is unknown, which is very useful in creating the model or
solving the problem. For example, in the blind hand problem, a lot of
information is missing, but MDS can still solve the
problem according to the available information. When an unknown
residue is returned, the system will make the various necessary
assertions associated with that residue. Most AI problem
solving systems deal only with two valued logic; when an unknown
predicate occurs, such a system will assume that the truth value of that
predicate is either true or false, which may cause some inconsistency
in the model later on. It is tedious to maintain such a system.
Especially when a theorem prover is used in the system, the system may
derive unexpected wrong results using this inconsistent data.

ACKNOWLEDGMENTS

I wish to express my gratitude to Prof. Chitoor V. Srinivasan who
fully explained the various features in MDS to me and helped me solve
the blind hand problem in MDS. Moreover, he carefully read the draft
and corrected many mistakes in my English. I also wish to thank Prof.
Natesa S. Sridharan who suggested that I do this study and made
valuable comments and criticisms. Finally, many thanks are due to my
good friend, Frank Hawrusik, who read the early draft carefully and
thoroughly and gave me a good deal of assistance in my writing.

REFERENCES

1). J. L. Darlington: "Deductive Plan Formation in Higher-order
    Logic", MI 7, 1973, pp. 129-137.

2). J. McCarthy & P. J. Hayes: "Some Philosophical Problems from the
    Standpoint of Artificial Intelligence", MI 4, 1969.

3). E. D. Sacerdoti: "Planning in a Hierarchy of Abstraction Spaces",
    AI 5, 1974, pp. 115-135.

4). N. S. Sridharan: "The Architecture of Believer: Part II. The Frame
    Problem", CBM-TR-47, DCS, Rutgers Univ., 1976.

5). C. V. Srinivasan: "The Architecture of Coherent Information
    System: A General Problem Solving System", IEEE Transactions on
    Computers, vol. C-25, no. 4, 1976, pp. 390-402.

6). C. V. Srinivasan: "The Model Space of the Meta Description System",
    SOSAP-TR-19, DCS, Rutgers University, February 1976.


SOSAP-TR-18 


January  1976 


INTRODUCTION  TO  THE  META  DESCRIPTION  SYSTEM 
C.  V.  Srinivasan 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New Brunswick, New Jersey


This  research  was  partially  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6 
to  the  Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research Projects Agency or the U. S. Government.


INTRODUCTION TO THE META DESCRIPTION SYSTEM.

by

C.V.Srinivasan.

KEY WORDS: KNOWLEDGE REPRESENTATION, PROBLEM SOLVING,
DESCRIPTIVE SYSTEMS.

Abstract: 

In this paper we introduce the basic concepts of a
knowledge based system called the Meta Description System
(MDS). In MDS one first defines the language to be used for
describing the knowledge in a domain, and the semantics of the
language. Based on this definition MDS builds for itself a
model space for the domain and uses the model space in a
variety of problem solving activities.


INTRODUCTION TO THE META DESCRIPTION SYSTEM (*1)

by

C.V.Srinivasan. (*2)

1.  INTRODUCTION. 

The "problem of representation" in AI systems arises
because of the ever present need to work with an incomplete
corpus of immediately accessible information. In small well
understood domains of reasonable complexity it is often
possible to arrive at a "good" decomposition of the domain of
knowledge into simpler parts for each one of which the
necessary information and its associated processing facilities
can be appropriately packaged, with reasonable assurance that
all the components would interact harmoniously. As the
complexity of the domain increases it becomes necessary to
transfer some of the responsibility for packaging knowledge to
the system itself. Here we face enormous difficulties. We do
not even have a commonly agreed upon view of what the issues
of the "problem of representation" are. At one extreme we
have the view of procedural encapsulation of knowledge
[Winograd 1972, Hewitt 1973, Minsky 1975]. At the other
extreme we have the "purely declarative" approaches associated
with the use of general deductive systems, where there is only
a weak notion of packaging; the necessary information in a
context is available only implicitly. We also have now the
middle view: We need both procedural encapsulation and
declarative representations of knowledge; we should,
therefore, find ways for wiring in both in some common
framework of a problem solving control structure [Winograd
1975].

In the organization of these problem solving control
structures there are now a few organizational concepts that are
here to stay. These should have a role to play in the
architecture of any intelligent system: One is the use of
"model space" and model based reasoning, the other is the use
of some kind of general deductive facility, and the third is
procedural encapsulation. The current procedural/declarative
controversy seems to be centered mainly around how one might
wire in the necessary procedural knowledge and deductive
facilities in the model space for a domain of knowledge. We
believe that this is a non-issue. We would like to present
below a point of view where the issues of "problem of
representation" are presented as issues of communication among
interacting processes. The basis for this shift of view is
briefly the following:

The control structures associated with the model space,
with the deductive mechanisms and with the procedure
invocation and execution can all be made independent of domain
of knowledge. And, what is more, they can be made to
specialize themselves by sharing their experiences with a
given corpus of domain knowledge. This domain knowledge might
itself have been described in a language of the domain,
without reference to the control structures that use it. In
this context the effectiveness of the specialization will
depend crucially on the nature of the communication that can
take place among the interacting processes. Our thesis is that
these interacting processes should have available to them the
full richness of the language of the domain to communicate
among themselves. The seeds of the communication problem will
then lie in the structure, brevity, effectiveness and focus
achieved in this communication. These would depend on the
primitives available in the language of the domain and the
concepts expressible in the language. The language of logic
is too general, cumbersome and non-domain-specific to be useful
here. To view the problem in this manner it is essential that
one be able to have a view of what the knowledge in a domain
is, independent of the model space and the control structures
that use it. The framework of the Meta Description System
(MDS) encourages the development of this view. The basic
outlines of this framework are introduced here. At the moment
we are still unable to offer a well reasoned approach that
would reduce the "problem of representation" to a viable
technical problem. But we believe we have some hopes of
achieving this end.

The central concept in the organization of the Meta
Description System (MDS) is this separation that is achieved
between the structure and semantics of domain knowledge on the
one hand, and the control structures of the model space and
problem solvers on the other. In MDS one first specifies the
structure and semantics of the language of discourse for a
domain. This is the DOMAIN DEFINITION specified in the MDS
formalism. Based on the domain definition MDS builds a model
space [Srinivasan 1976a] for the domain. This model space is
used by a goal directed problem solver, called DESIGNER, as
well as a Theorem Prover. These problem solving control
structures have the ability to specialize themselves to
operate efficiently in the domain, by communicating with each
other in the language of the domain. The nature and
effectiveness of this communication will depend on the
primitives available in the domain language. In the context
of MDS one may now experiment with alternate modes of
description and investigate their effects on the problem
solving efficiency.

In  this  paper  we  shall  introduce  the  descriptive 
formalism  of  MDS  in  the  context  of  a  simple  domain:  The  domain 
of  MAZE  problems.  We  will  discuss  the  solution  of  the  maze 
problem  in  the  model  space,  and  in  the  context  of  the 
DESIGNER. In another paper [Srinivasan 1976b] we show how the
same problem is solved by the Theorem Prover in MDS, using the
very same definitions and the model space. We shall not
discuss here aspects of domain specific specialization. The
DESIGNER has a limited capacity to learn and generalize from
its interactions with the model space [Srinivasan 1973,
1975a].

2.  NATURE  OF  KNOWLEDGE  AND  PROCESSORS  IN  MDS. 

MDS accepts three kinds of knowledge about facts, objects,
processes, and problem solving in a domain. The first is the
STRUCTURAL knowledge. This pertains to the forms of
descriptions of objects in the domain. The second is the
SENSE knowledge. This pertains to the semantics associated
with the structures. Sense knowledge is specified as
predicates in the context of set constructions. The third is
the TRANSFORMATIONAL knowledge. This pertains to the
knowledge necessary to create new objects in the model space
of the domain, and specialized knowledge pertinent to the
updating processes in the model space. The transformation
rules are of two types: those that are directly accessed and
executed by CHECKER to effect well specified contingent
changes in the model space, and those that are invoked on the
basis of a pattern directed invocation.

The execution of all the processes is transparent to the
system, in the sense that the different components of the
system can pass information to each other about what they are
doing, unless one invokes specially built in opaque functions
to do specific tasks. The STRUCTURE and SENSE definitions are
used by the CHECKER-INSTANTIATOR system to create and maintain
a consistent model space. The transformation rules are used
by DESIGNER to plan and execute sequences of actions that
modify the model space to reach desired objectives. The third
control structure is that of the Theorem Prover [Srinivasan
1976b]. The TP is used to aid the CHECKER-INSTANTIATOR and
the DESIGNER. The fourth control structure is that of the
LINGUIST. This may be used to define special user languages
specific to a domain. The language understanding process is
viewed as one of generating the appropriate structures in the
model space in response to utterances in the language. This
understanding process may implicitly invoke the full problem
solving power of the system. All these control structures are
domain independent. They specialize themselves to the domain
based on the domain information.

MDS is not yet fully operational. We expect the model
space management system [Srinivasan 1976b] to be working in a
few months.


3. THE FORMS OF DOMAIN DEFINITION. THE MAZE PROBLEM.

The first part of domain definition is to identify the
classes of objects in the domain. MAZEs have NODES, NODE,
PATH, MAZEPROBLEM, etc. The description structures for these
are specified first. This is shown in Table I. The
definition of MAZE says the following: A MAZE is a NODE (in
contrast to being a LIST, which is a collection of objects in
the domain). The $N flag associated with the MAZE specifies
this. The $ flag indicates that instances of MAZE need not be
named objects in the model space. (For a full discussion of
the various descriptive facilities in MDS see Srinivasan
1975.) A MAZE has three descriptive relations associated with
it. They are "startingnodes", "exit", and "contains". The
startingnodes of a MAZE is an instance of NODES, which is a
collection of instances of NODE. The exit of a MAZE is a
NODE. This indicates that there is precisely one exit node,
since NODE is a $N class of object. The MAZE contains NODES
(a collection of nodes). The flags CC1, CC2 and CC3 indicate
that certain consistency conditions (sense definitions; we
shall use the two names interchangeably) are associated with
the respective relations. We shall investigate the forms of
these later. Similarly, we have the NODES, NODE, PATH, and
MAZEPROBLEM schemas. Schemas of this kind are called
TEMPLATES in MDS. The flag $L of NODES indicates that it is a
LIST template.


TABLE I: THE DEFINITION OF THE MAZE DOMAIN.

[(MAZE $N)
    (startingnodes (NODES $L) startingnodesof CC1)
    (exit (NODE $N) exitof CC2)
    (contains (NODES $L) belongto CC3)].

CC1: [MAZE startingnodes].
    [(NODE X)|(@ startingnode X)(X connectedfrom NIL)].

CC2: [MAZE exit].
    [(NODE X)|(@ exit X)(X connectedto NIL)].

CC3: [MAZE contains].
    [(NODE X)|(@ startingnode X) V (@ exit X) V
        ((SOME NODE Y)(@ contains Y)
            (Y canreach X))].

[(NODES $L)
    (elemdn (0 * NODE))].

[(NODE $N)
    ((connectedfrom V) (NODE $N) connectedto)
    ((belongto V) (MAZE $N) contains)
    (connectedto (NODES $L) connectedfrom CC4)
    ((canreach $X) (NODES $L) canbereachedfrom CC5)].

CC4: [NODE connectedto].
    [(NODE X)|(@ connectedto X)(NOT(X is @))].

CC5: [NODE canreach].
    [(NODE X)|(@ connectedto X)].

[(PATH $N)
    (startingnode (NODE $N) startingnodeof)
    (tail (PATH $N) tailof CC6)
    ((endingnode $) (NODE $N) endingnodeof CC7)].

CC6: [PATH tail].
    [(PATH P)|(@ tail P)
        ((P is NIL) V
         ((SOME NODE N)
            (P startingnode N)
            (@ startingnode:is:connectedto N)))].

CC7: [PATH endingnode].
    [(NODE X)|((@ tail NIL)<->(@ startingnode X))
        ((ALL PATH P)
            (@ tail P)->(P endingnode X))].



Every instance of NODES is a collection of the form (n1 n2 ...
nk) where each n is an instance of NODE. This is indicated by
the form "(elemdn (0 * NODE))" in the definition of NODES.
The pair 0, * indicates that the lower bound on the number of
elements in any instance of NODES is 0, and the upper bound is
unlimited. Thus, for an instance of MAZE, say m, its
startingnodes might be (n1 n2 ... nj). This will appear in
the model space of MDS as an assertion of the form (m
startingnodes (n1 n2 ... nj)). In the model space this is
interpreted as: (m startingnode n1), (m startingnode n2), ...,
and (m startingnode nj) (we shall use singular and plural
forms of relations interchangeably, as is convenient).

For every relation used in a template, the template also
specifies its inverse relation. Thus, the inverse of
"contains" is "belongto", and the inverse of "exit" is
"exitof": For a MAZE m, if (m exit n) is true for a node, n,
then (n exitof m) is also true in the model space, and vice
versa.
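
The two conventions just described (element-wise interpretation of a list-valued assertion, and automatic maintenance of inverse relations) can be illustrated with a small sketch. This is not MDS code: the class `ModelSpace`, the table `INVERSES` and the method names are hypothetical, chosen only to mirror the relation names in Table I.

```python
# Illustrative sketch only; ModelSpace, INVERSES and assert_fact are
# hypothetical names, not part of MDS.

INVERSES = {
    "startingnode": "startingnodeof",
    "exit": "exitof",
    "contains": "belongto",
    "connectedto": "connectedfrom",
}
# Make the table symmetric so that each inverse maps back to its relation.
INVERSES.update({inv: rel for rel, inv in list(INVERSES.items())})


class ModelSpace:
    def __init__(self):
        self.facts = set()  # set of (subject, relation, value) triples

    def assert_fact(self, subject, relation, value):
        """Store (subject relation value); a list value is expanded element-wise."""
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            self.facts.add((subject, relation, v))
            inverse = INVERSES.get(relation)
            if inverse is not None:
                # The inverse assertion is recorded together with the original.
                self.facts.add((v, inverse, subject))

    def holds(self, subject, relation, value):
        return (subject, relation, value) in self.facts


space = ModelSpace()
space.assert_fact("m", "startingnode", ["n1", "n2"])  # (m startingnodes (n1 n2))
space.assert_fact("m", "exit", "n9")                  # (m exit n9)
```

After these two assertions, `space.holds("n9", "exitof", "m")` is true without having been asserted explicitly, which is the behaviour the text attributes to the model space.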

Each consistency condition is of the form

    ((<template> X)|(P @ X)),

where (P @ X) is called the "predicate" of the CC, X is called
the "set variable" of the CC and @ is called the "current
instance" of the CC. It is the instance at which the CC is
being evaluated in the model space. The CC may be read
uniformly as: "The collection of all instances, X, of
<template> such that (P @ X) is true." Thus, CC1 says that the
startingnodes of a MAZE is the collection of all nodes, X, such
that, if X is given as the startingnode ("(@ startingnode X)"
appearing in CC1 is to be read in this manner), then (X
connectedfrom NIL) is true. We shall explain this convention
further below. (X connectedfrom NIL) means that there is no
node that is connectedto X (connectedto and connectedfrom are
inverses of each other). If the necessary and sufficient
conditions are known for the definition of a relation, then
the predicate of the CC, (P @ X), will be such that, for a
relation, r, at which the CC is defined,

    (@ r y)<->(P @ y).

However, if only a necessary condition is known for a
relation to be true, then one would have a predicate, Q, such
that

    (@ r y)->(Q @ y).

In this case we modify the predicate of the CC as shown
below:

    (@ r y)<->(@ r y)(Q @ y).

Thus, for ALL CC's the if and only if condition is true.
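
The reading of a CC as "the collection of all X such that the predicate holds" can be sketched as a set comprehension. The following is a minimal illustration under stated assumptions: the maze data, the function names and the idea of passing the externally proposed values as an argument are all hypothetical. It shows the necessary-condition form, in which a proposed value is accepted only if Q also holds.

```python
# Hypothetical maze, given as "connectedto" edges: a -> b -> c.
connected_to = {"a": {"b"}, "b": {"c"}, "c": set()}
nodes = set(connected_to)


def connected_from(x):
    """Nodes that are connected to x (the inverse of connectedto)."""
    return {n for n in nodes if x in connected_to[n]}


def cc1(proposed):
    """CC1 as a set: proposed starting nodes X for which (X connectedfrom NIL)."""
    return {x for x in nodes if x in proposed and not connected_from(x)}


def cc2(proposed):
    """CC2 as a set: proposed exits X for which (X connectedto NIL)."""
    return {x for x in nodes if x in proposed and not connected_to[x]}
```

Here `proposed` plays the role of the values supplied through "(@ startingnode X)" or "(@ exit X)", and the second conjunct plays the role of Q. With this data, proposing both "a" and "b" as starting nodes leaves only "a", since "b" has an incoming connection.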

All quantifications appearing in CC's will range only
over specified classes of objects in the domain. Thus, we
write in CC7 ((ALL PATH P)(@ tail P)->(P endingnode X)), to
indicate ((ALL P)(P instanceof PATH)&(@ tail P)->(P
endingnode X)). Generally, between adjacent predicates of the
form (x r y)(p r1 q) an implicit & is assumed. In general,
((ALL X x)P(x)) is interpreted as ((ALL x)(x instanceof X)->
P(x)). Similarly, ((SOME X x)P(x)) is interpreted as ((SOME
x)(x instanceof X)&P(x)). CC6 thus says that a given PATH, P,
can be the tail of a PATH, @, if P is NIL or if, for some
NODE, N, it is true that (P startingnode N) and the
startingnode of @ is connectedto N. The phrase "(@ tail P)"
indicates that P should be specified by an external agent.
(P is NIL) is interpreted as "P is identically equal to NIL."

CC3 defines the nodes contained by a MAZE inductively in
terms of the startingnodes of a MAZE. The flag $X is
associated with the relation "canreach" in the NODE template.
By convention, X here indicates that the relation is
transitive; the $ indicates that the value of this relation is
never stored in the model space. Every time it is called for
it is computed using the CC, CC5. CC5 specifies that the
instance of NODES reached by a NODE is precisely the same as
the NODES to which it is connected. However, since canreach
has been declared to be transitive the model space will always
return the transitive closure of the relation. Notice that a
PATH has been defined to be anything starting with a node and
containing a tail which is itself a PATH. The endingnode of a
PATH is not stored in the model space. CC7 specifies that the
endingnode is the same as the startingnode iff the tail of the
PATH is NIL, else the endingnode is the same as the endingnode
of the tail.

Even though the value returned by the CC's (sense
definitions) is always viewed as a collection, the model space
gives the returned value the proper interpretation based
on the template definition associated with the CC. Thus, in
the case of the endingnode of a PATH, it is known from the
template definition that the endingnode has to be a unique
NODE.
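
Two of the computed (never stored) relations above lend themselves to a short sketch: "canreach" as the transitive closure of "connectedto", and "endingnode" as a recursion on the tail of a PATH. This is an illustration only. The dictionary encoding of the maze and the tuple encoding of a PATH as (startingnode, tail), with None standing for NIL, are assumptions made for the example.

```python
def can_reach(connected_to, start):
    """Transitive closure of connectedto from start (iterative depth-first search)."""
    seen = set()
    stack = list(connected_to.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(connected_to.get(node, ()))
    return seen


def ending_node(path):
    """CC7: the startingnode if the tail is NIL, else the endingnode of the tail."""
    start, tail = path
    return start if tail is None else ending_node(tail)


# Hypothetical maze with a cycle a -> b -> d -> a and a dead end c.
maze = {"a": {"b"}, "b": {"c", "d"}, "c": set(), "d": {"a"}}
path = ("a", ("b", ("c", None)))
```

Because the closure is recomputed on every call, nothing has to be updated when a "connectedto" assertion changes, which is one plausible motivation for declaring the relation with the $ flag.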


The  template  for  MAZEPROBLEM  is  defined  in  Table  II. 


TABLE II: THE MAZEPROBLEM.

[(MAZEPROBLEM $N)
    (startingnode (NODE $N) startingnodeof CC8)
    (solution (MAZEPROBLEM $N) solutionof CC9 TR9)
    (maze (MAZE $N) mazeof)].

CC8: [MAZEPROBLEM startingnode].
    [(NODE X)|(@ startingnode X)
        (@ maze:contains X)].

CC9: [MAZEPROBLEM solution].
    [(MAZEPROBLEM MP)|((@ startingnode:is:exitof:mazeof @)
            (MP is NIL) V
        ((SOME NODE N)
            (N canreach:exitof:mazeof @)
            (@ startingnode:is:connectedto N)
            (MP startingnode N)].

TR9: [MAZEPROBLEM solution].
    [(IFUNKNOWN (((SOME NODE n)
                    (@ startingnode:is:connectedto n)
                    (n canreach:exitof:mazeof @))
                 (BIND MP (CREATE MAZEPROBLEM
                             (startingnode n)
                             (maze (@ maze))))
                 (ASSERT (@ solution MP))
                 (ASSERT (MP solution].



The MDS model space works in three valued logic, T, ?
(unknown), and NIL, ordered T > ? > NIL. The CHECKER is used
to evaluate CC's in the model space. CHECKER has no authority
to change the model space while evaluating the CC's. It can
only poll and check, giving the proper interpretation for the
quantifiers. In the case of the solution of the MAZEPROBLEM
we have both a CC and a TR (transformation rule). TR9 will be
invoked by the CHECKER of the model space if CC9 evaluates to
UNKNOWN. This would be the case if no appropriate instances
of MAZEPROBLEM are available in the model space. By
convention TR9 will have three arguments available to it: the
current instance, @, at which the associated CC was evaluated;
the set variable, MP, of the CC; and the so called RESIDUES,
if any (see Srinivasan 1976a for a definition of residues).
Residues are predicates describing the reasons for the
success, failure or the unknown value of the associated CC.
They will always be sub-expressions of the CC, with specific
bindings for the bound variables of the CC. TR9 looks for
some node, n, such that the startingnode of @ is connected to
n, and n canreach the exit of the maze of the given
mazeproblem. If it succeeds, then it creates a new instance
of MAZEPROBLEM and assigns this as the solution of @. It then
goes ahead and further asserts that the solution of the new
mazeproblem should now be found. In this manner, if a
MAZEPROBLEM is started off with a startingnode, and an
assertion is made to find its solution, the CHECKER control
structure can by itself find the solution.
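
The ordering T > ? > NIL suggests one way the three truth values might be combined. The sketch below is an assumption, not something the text states: it treats conjunction as the minimum and disjunction as the maximum under that ordering (the strong Kleene convention), and it mimics the dispatch rule that a transformation rule is tried only when the CC evaluates to unknown. All names (`TV`, `tv_and`, `check`) are hypothetical.

```python
from enum import IntEnum


class TV(IntEnum):
    NIL = 0
    UNKNOWN = 1  # written "?" in the text
    T = 2


def tv_and(*vals):
    """Conjunction as the minimum under T > ? > NIL (assumed convention)."""
    return TV(min(vals)) if vals else TV.T


def tv_or(*vals):
    """Disjunction as the maximum under T > ? > NIL (assumed convention)."""
    return TV(max(vals)) if vals else TV.NIL


def tv_not(v):
    """Negation swaps T and NIL and leaves ? unchanged."""
    return TV(2 - v)


def check(cc_value, transformation_rule=None):
    """Run the transformation rule only when the CC value is unknown."""
    if cc_value is TV.UNKNOWN and transformation_rule is not None:
        return transformation_rule()
    return cc_value
```

Under this convention a conjunction containing a NIL conjunct is NIL even if other conjuncts are unknown, so a transformation rule such as TR9 would be reached only when no conjunct is already false.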

Another approach for solving a maze would be by the use
of the DESIGNER. The DESIGNER transformation rule for solving
a maze is shown in Table III. This rule will be invoked if
one makes the assertion, for some known node, n, in the maze
currently in the model space,

    ((SOME PATH P)(P startingnode n)(GOAL (P endingnode e)].

TABLE III: THE SOLVEMAZE RULE.

SOLVEMAZE[m n e].

    (((MAZE m)(NODE n e)(m contains n)(m exit e)
      (SOME PATH P)
      (GOAL (P startingnode n)(P endingnode e)))

This is the Header for the Rule. The rule is invoked by
pattern matching with this header.

     (BIND P (CREATE PATH (startingnode n)))
     (DCOND ((n is e)(ASSERT (P tail NIL)))
            ((n canreach e)
             ((SOME NODE D)(n connectedto D)
                (D canreach e)
                (ASSERT (P tail (SOLVEMAZE m D e].
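
The recursion in the SOLVEMAZE rule can be paraphrased in a conventional language. The following is a rough analogue under stated assumptions, not a rendering of DESIGNER: the maze is a hypothetical dictionary of "connectedto" edges, a PATH is a (startingnode, tail) tuple with None for NIL, and the `visited` argument is an addition of this sketch (the rule in Table III has no such guard) so that mazes with cycles terminate.

```python
def can_reach(connected_to, start):
    """Transitive closure of connectedto from start."""
    seen, stack = set(), list(connected_to.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(connected_to.get(node, ()))
    return seen


def solve_maze(connected_to, n, e, visited=None):
    """Return a PATH (n, tail) from n to e, or None if no path is found."""
    if n == e:
        return (n, None)                      # (n is e): (P tail NIL)
    visited = (visited or set()) | {n}        # guard against cycles (not in the rule)
    for d in sorted(connected_to.get(n, ())):  # (SOME NODE D)(n connectedto D)
        if d in visited:
            continue
        if d == e or e in can_reach(connected_to, d):   # (D canreach e)
            tail = solve_maze(connected_to, d, e, visited)
            if tail is not None:
                return (n, tail)              # (P tail (SOLVEMAZE m D e))
    return None


# Hypothetical maze: s connects to a and b; a leads back to s; b leads to the exit x.
maze = {"s": {"a", "b"}, "a": {"s"}, "b": {"x"}, "x": set()}
```

As in the rule, the "canreach" test prunes successors from which the exit is unreachable before the recursive call is made.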

4. CONCLUDING REMARKS.

The basic elements of the descriptive formalism of MDS
were introduced here in the context of a very simple example.
A detailed discussion of the language of Transformation Rules
and the DESIGNER processes appears elsewhere [Srinivasan
1975a]. An interesting aspect of the organization of the MDS
model space [Srinivasan 1976a] is that the same model space is
used by a variety of problem solving control structures:
DESIGNER and the Theorem Prover [Srinivasan 1976b]. This is
made possible because of the way the problem solving control
structures communicate with the model space [Srinivasan
1976a].



5.  ACKNOWLEDGEMENTS. 


During the course of the development of the concepts in
MDS I have had several discussions with many of my colleagues.
Discussions with Dr. Bertram Bruce and Dr. Robert Balzer
were particularly valuable. Prof. Sridharan participated in
my class on Knowledge Based systems. Explaining the modeling
concepts of MDS to Prof. Sridharan and describing parts of
BELIEVER (a psychological modelling system) in the MDS
formalism has been a particularly valuable experience. My
students John Ng, Joel Irwin and Tau Hsu are all involved in
the implementation of MDS. I am thankful to Prof. Adrian
Walker, Ng and Hsu for carefully reading through this
manuscript and for constructive suggestions for revisions.

6. REFERENCES:

Hewitt, C., Bishop et al.: [1973]: A universal modular ACTOR
        formalism for artificial intelligence. Proc. of
        IJCAI3, 1973, 235-245.

Minsky, M.: [1975]: A framework for representing knowledge. In
        Winston, P. (Ed), The psychology of computer
        vision. New York, McGraw Hill.

Winograd, T.: [1972]: Understanding natural language. New York,
        Academic Press.

Winograd, T.: [1975]: Frame representations and the
        declarative/procedural controversy, in
        Representation and Understanding, Bobrow & Collins
        (Ed), Academic Press.

Srinivasan, C.V.: [1973]: "The Architecture of Coherent
        Information Systems: A general Problem Solving
        System", in 3IJCAI, Stanford 1973. A revised
        version of this paper appears in IEEE Transactions
        on Computers, Special issue on Artificial
        Intelligence, April 1976.

        [1975a]: The Meta Description System, RUCBM-TR-50,
        Department of Computer Science.

        [1975b]: A formalism to define the structure of
        knowledge, RUCBM-TR-51, Department of Computer
        Science, Rutgers University, New Brunswick, N.J.

        [1976a]: "The model space of the meta description
        system", Department of Computer Science Report,
        SOSAP-TR-19, Rutgers University.

        [1976b]: "Theorem Proving in the meta description
        system", Department of Computer Science technical
        report, SOSAP-TR-20, Rutgers University.

Irwin, J. ...: [1975b]: The description of CASNET in MDS,
        RUCBM-TR-49, Department of Computer Science.


SOSAP-TR-20


January  1976 


THEOREM  PROVING  IN  THE  META  DESCRIPTION  SYSTEM 
C.  V.  Srinivasan 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  partially  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6 
to  the  Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official policies, either expressed or implied, of the Advanced
Research  Projects  Agency  or  the  U.  S.  Government. 


THEOREM PROVING IN THE META DESCRIPTION SYSTEM.

by

C.V.Srinivasan.

KEY WORDS: THEOREM PROVING, MODEL SPACE, GENTZEN's
SYSTEM.

Abstract:

In this paper we introduce a way of using the natural
deduction system of Gentzen to do theorem proving in the
context of a model space. The Theorem Prover actively uses
the model space to test and generate hypotheses and guide
itself. The model space itself is defined in the context of
the descriptive formalism of the Meta Description System.


THEOREM  PROVING  IN  THE  META  DESCRIPTION  SYSTEM.  (*1) 

by 

C.V.Srinivasan. (*2)

1.  INTRODUCTION. 

The purpose of this paper is to introduce the basic
concepts used in the organization of the Theorem Prover (TP)
in the Meta Description System (MDS) [Srinivasan 1973,
1975a,b, 1976a,b]. We follow essentially Beth's Semantic
Tableaux [Beth 1959] approach. The TP seeks to construct a
counterexample, and in doing this it makes use of the model
space of MDS, not only to test and generate hypotheses, but
also to keep track of the various cases encountered in the
theorem proving process and to actually build the
representations for the counterexample or solutions, as the
case may be. Our emphasis in this paper is on the kinds of
interaction and communication that take place between the
theorem proving control structure and the model space of MDS.
We shall attempt to exhibit this interaction in the context of
a simple example: the solution of maze problems. We assume
that the reader is familiar with the concepts and organization

*1. This work was supported by grants from both the National
Institute of Health (grant number RR-643), and the Advanced
Research Projects Agency (Grant number DAHC15-73-G6), of the
Government of the United States of America.

*2. Department of Computer Science, Hill Center, Rutgers
University, New Brunswick, N.J. 08903.


THEOREM PROVING IN MDS...C.V.Srinivasan, Jan. 1976. Page 3

of the model space in MDS [Srinivasan 1976b]. We shall also
assume that the reader is familiar with Gentzen's [Kanger
1963] system of logic and its use in Beth's Semantic Tableaux
approach to Theorem Proving. For the definition of the
problem domain we shall use the MAZE DEFINITION introduced in
[Srinivasan 1976a]. In [Srinivasan 1976c] we shall discuss a
possible application of the MDS theorem prover, to prove
Cantor's theorem in Set Theory.

We believe the most significant aspect of the TP
organization introduced here is the separation that is
achieved between the model space for a domain and its
associated control structures, and the control structure of
the Theorem Prover itself. The two control structures
communicate with each other in the language defined for the
domain, within the descriptive formalism of MDS.

The rules that govern the operation of the Theorem Prover
are summarized in Tables IIa-c. The interpretation of these
rules in the context of the TP control structure is described
in Table V. The application of these rules will become clear
in the discussion of the example.

2.  BASIC  DEFINITIONS  AND  CONVENTIONS. 

The basic assumptions and conventions of the formalism of
SEQUENTS and the concept of a Theorem Proving State, TP-State,
are explained in Table I. The TP rules, shown in Tables
IIa-c, specify transformations on the sequents. Each rule of
transformation has two parts, one above the line and the other
below the line. The rule specifies that in any TP-state, a
sequent with the form shown below the line may be replaced by
sequent(s) of the form shown above the line.


TABLE I: THE SEQUENTS AND THE THEOREM PROVING STATE.

SEQUENTS: Each sequent is a string of the form

        (P1)(P2)...(Pk) => (Q1)(Q2)...(Qm),

    where P, Q are arbitrary predicates (in mini-scope
    form). The string (P1)(P2)...(Pk) may be
    interpreted as a conjunction of the predicates
    (P1), (P2), ... and (Pk). The string
    (Q1)(Q2)...(Qm) may be interpreted as a
    disjunction of the predicates (Q1), (Q2), ...
    and (Qm). We shall use the symbol, S, to denote
    arbitrary, possibly null, sequences of predicates
    of this form. Thus, the general form of a sequent
    is: "S1 => S2".

THEOREM PROVING STATE: The TP-STATE is a set of
    sequents. We shall use ";" to separate the
    members of this set, and typically represent a
    TP-state by a string of the form:
    SEQ1;SEQ2;...;SEQk, where each "SEQ" is a sequent.
    The TP-state "=> S" asserts that S is a theorem.
    The TP-state "S =>" asserts that ~S (the negation
    of S) is a theorem.
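
The interpretation of a sequent given in Table I (conjunction on the left, disjunction on the right) can be made concrete with a small sketch over propositional atoms. The encoding is an assumption for illustration: predicates are plain strings, a valuation is a dictionary from strings to booleans, and a TP-state is read here as requiring all of its sequents to hold. The names `Sequent`, `sequent_holds` and `state_holds` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Sequent(NamedTuple):
    left: Tuple[str, ...]   # (P1)...(Pk), read as a conjunction
    right: Tuple[str, ...]  # (Q1)...(Qm), read as a disjunction


def sequent_holds(seq: Sequent, valuation: Dict[str, bool]) -> bool:
    """A sequent fails only if every P is true and every Q is false."""
    left_true = all(valuation[p] for p in seq.left)     # empty left counts as true
    right_true = any(valuation[q] for q in seq.right)   # empty right counts as false
    return (not left_true) or right_true


def state_holds(tp_state: Iterable[Sequent], valuation: Dict[str, bool]) -> bool:
    """Assumed reading: a TP-state holds when each of its sequents holds."""
    return all(sequent_holds(s, valuation) for s in tp_state)
```

With an empty left side the sequent "=> S" holds exactly when S is true, and with an empty right side "S =>" holds exactly when S is false, which matches the two readings stated at the end of Table I.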


There are three kinds of rules: PROPOSITIONAL rules,
SUBSTITUTION rules, and QUANTIFIER rules. The propositional
rules are derived from tautologies. Each rule has a label of
the form "(=> c)" or "(c =>)" associated with it. It is
convenient to think of a rule (=> c) as a rule for the
elimination of the symbol, c (which could be a connective or
a quantifier symbol), from the right hand side of a sequent.
Similarly, a rule of the form (c =>) is used to eliminate c
from the left hand side of a sequent. The order of
application of these rules is shown in Table III.

The SUBSTITUTION rules and the QUANTIFIER rules shown here
are slightly different from the usual ones (see Kanger 1963).
In the case of a substitution, the predicate (x=y) is not
maintained in the sequents of the TP-state after the
substitution [see Table IIb]. In the case of the quantified
expressions, the expressions themselves are not maintained in
the sequents of the TP-state after substituting for the bound
variables by their respective Eigen Variables (or Eigen Terms)
[see Table IIc]. These differences from the usual forms of
these rules arise because of the way the rules are used in
the context of the MDS model space. The equality predicate
gets incorporated into the model space and thus need not be
maintained in the TP-state. In the case of the quantified
expressions, the expressions themselves are always available
in the model space, and are accessed when needed. Thus, they
need not be maintained in the TP-state. At any instant the
TP-state represents only a partial state of the theorem
proving process. The total state of the process will include
both the TP-state and the state of the entire model space.
The ever present model space is always implicitly on the left
hand side of every sequent in the TP-state. These
considerations will become clear in the discussion of the
example.


TABLE II: RULES OF TRANSFORMATION OF SEQUENTS.

(a). PROPOSITIONAL RULES:

(~ =>):     S1 => S2(P)S3
            --------------
            S1(~P)S2 => S3

(=> ~):     S1(Q)S2 => S3
            --------------
            S1 => S2(~Q)S3

(=> &):     S1 => S2(Q1)S3; S1 => S2(Q2)S3
            ------------------------------
            S1 => S2((Q1)&(Q2))S3

(V =>):     S1(P1)S2 => S3; S1(P2)S2 => S3
            ------------------------------
            S1((P1)V(P2))S2 => S3

(=> ->):    S1(Q1) => S2(Q2)S3
            ------------------------
            S1 => S2((Q1)->(Q2))S3

(-> =>):    (P2)S1 => S2; S1 => (P1)S2
            --------------------------
            ((P1)->(P2))S1 => S2

(<-> =>):   S1(P1)(P2)S2 => S3; S1S2 => (P1)(P2)S3
            --------------------------------------
            S1((P1)<->(P2))S2 => S3

(=> <->):   S1(Q1) => S2(Q2)S3; S1(Q2) => S2(Q1)S3
            --------------------------------------
            S1 => S2((Q1)<->(Q2))S3

(b). SUBSTITUTION RULES.

(= =>):     S1[x:y]S2[x:y] => S3[x:y]
            -------------------------
            S1(x=y)S2 => S3

The notation "S[x:y]" is to be read as: "All occurrences of x
in S are substituted by y." We have the axiom, "=> S1(x=x)S2".
We also have the variant of the above (= =>) rule, where
"(y=x)" occurs in the sequent below the line, instead of
"(x=y)", shown above.

(c). QUANTIFIER RULES.

NOTE: ((SOME T1 x)P(x)) is logically equivalent to
((THERE-EXISTS x)(T1 instance x)&P(x)). Similarly, ((ALL T1
x)P(x)), or simply, ((T1 x)P(x)) is logically equivalent to
((ALL x)(T1 instance x)->P(x)).

Existential Generalization.

(=> SOME):  S1 => S2(Q[x:z])S3
            ---------------------------
            S1 => S2((SOME T1 x)Q(x))S3

Universal Generalization.

(ALL =>):   S1(P[x:z])S2 => S3
            --------------------------
            S1((ALL T1 x)P(x))S2 => S3

The variable, z, should be a NEW variable, not used earlier in
the TP-process. We shall call it an EIGEN TERM of type T1. Its
potential values range over all the instances of T1 and eigen
variables of type T1. In effect, z has the status of a LISP
unbound variable whose potential range of values is known.
If (z1 z2 ... zk) are the potential values of z, then make
note that z = (ONEOF (z1 z2 ... zk)).

Existential Instantiation.

(SOME =>):  S1(P[x:q])S2 => S3
            ---------------------------
            S1((SOME T1 x)P(x))S2 => S3

Universal Instantiation.

(=> ALL):   S1 => S2(Q[x:q])S3
            --------------------------
            S1 => S2((ALL T1 x)Q(x))S3

Here q is a new instance of T1. We shall call it an EIGEN
VARIABLE of type T1. q can be potentially equal to any one of
the Eigen Terms of type T1 or eigen variables of type T1
created in the TP-process, prior to the creation of q.
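
To show how the propositional rules of part (a) are used, the following sketch applies them from the sequent below the line to the sequent(s) above it until only atoms remain. It is a plain propositional decision procedure, not the MDS Theorem Prover: it has no model space, no substitution or quantifier rules, and its encoding of formulas as nested tuples is an assumption. The cases for "and" on the left and "or" on the right are not in the table; they are included on the assumption that they correspond to the juxtaposition reading of each side of a sequent.

```python
# Formulas: an atom is a string; compound formulas are tuples such as
# ("not", A), ("and", A, B), ("or", A, B), ("imp", A, B), ("iff", A, B).

def prove(left, right):
    """Return True if the sequent left => right reduces to axioms only."""
    for i, f in enumerate(left):
        if isinstance(f, tuple):
            rest, op = left[:i] + left[i + 1:], f[0]
            if op == "not":    # (~ =>)
                return prove(rest, right + [f[1]])
            if op == "and":    # juxtaposition on the left (assumed, not in the table)
                return prove(rest + [f[1], f[2]], right)
            if op == "or":     # (V =>)
                return prove(rest + [f[1]], right) and prove(rest + [f[2]], right)
            if op == "imp":    # (-> =>)
                return prove(rest + [f[2]], right) and prove(rest, right + [f[1]])
            if op == "iff":    # (<-> =>)
                return (prove(rest + [f[1], f[2]], right)
                        and prove(rest, right + [f[1], f[2]]))
            raise ValueError(f"unknown connective {op!r}")
    for i, f in enumerate(right):
        if isinstance(f, tuple):
            rest, op = right[:i] + right[i + 1:], f[0]
            if op == "not":    # (=> ~)
                return prove(left + [f[1]], rest)
            if op == "and":    # (=> &)
                return prove(left, rest + [f[1]]) and prove(left, rest + [f[2]])
            if op == "or":     # juxtaposition on the right (assumed, not in the table)
                return prove(left, rest + [f[1], f[2]])
            if op == "imp":    # (=> ->)
                return prove(left + [f[1]], rest + [f[2]])
            if op == "iff":    # (=> <->)
                return (prove(left + [f[1]], rest + [f[2]])
                        and prove(left + [f[2]], rest + [f[1]]))
            raise ValueError(f"unknown connective {op!r}")
    # Only atoms remain: the sequent is closed iff both sides share an atom.
    return bool(set(left) & set(right))


peirce = ("imp", ("imp", ("imp", "p", "q"), "p"), "p")
```

A call such as `prove([], [peirce])` starts from the TP-state "=> S". A branch that ends with no shared atom corresponds to a candidate counterexample, which is the object the tableau method is trying to construct.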


3. THE CONTROL STRUCTURE.


The control structure for the theorem prover is shown in
Table  III.  This  table  is  best  understood  in  the  context  of 
the  example  discussed  in  the  next  section.  The  significant 
aspects  of  the  table  are  discussed  below. 


All the applicable propositional, equality and quantifier
rules are applied first. As a result of this process the
sequents in the TP-state might have several occurrences of
elementary predicates of the form (x r y) (*3). In each
predicate of the form (x r y), x and y might be either eigen
variables (i.e. constants in the model space), or eigen terms
(variables in the model space with possibly specified ranges
of potential values). These elementary predicates are
asserted into the model space, using the THASSERT function, as
described in step 2 of Table III.

In each THASSERT the model space will be used to actively
seek possible assignments of values for the eigen terms
that result in a contradiction. We have two kinds of
contradiction. The WEAK contradiction is a contradiction with
respect to the model space. In this case a predicate in the
THASSERT could not be accepted by the model space for any of
the possible assignments of values to its eigen terms. The
STRONG contradiction is the usual concept of contradiction
used in theorem proving: There are predicates, (x r y) and
~(u r v), in a THASSERT, and value assignments, say p for x
and u, and q for y and v. If such valuations exist, then the
THASSERT will find them. In each THASSERT we first look for
strong contradictions. If no strong contradictions exist then
we


*3  We  shall  talk  about  only  binary  predicates  in  this  paper. 
In  MDS  there  are  facilities  available  to  consider  n-ary 
predicates  for  any  n>  0.  Also,  one  may  have  function  symbols 
occurring  in  the  first  order  expressions.  Examples  of 
theorems  with  function  symbols  in  them  are  discussed  in 
(8rinivasan  1976c]. 


THEOREM  PROVING  IN  MDS. . .C.V.Sr inivasan,  Jan.  1976. Page  9  __ 

look  for  weak  contradictions.  We  shall  say  that  a  TP-state 
contradicts  'the  raddel  space  if  there  is  a  strong  or  weak 
Contradiction- in-  every  sequent  -of-  the  TP-state  for  some 
possible,  but  common  choice  of  values  (i.e.  the  same  eigen 
term' occur ring  in'two  different  sequents' should  have  the  same 
value'  chosen  for  it  in  both  sequentsj >  for  the  eigen  terms  in 
the  TP-state.  Notice  that  the  contradiction  is  defined  with 
respect  to' the 'model  space.  We  are  making  the  assumption  here 
that  our  domain  definitions  are  such  that  if  there  is  a 
contradiction  with  respect  to  the  model  space  then  in  the 
TP-process  it  will  eventually  show  up  as  a  strong 
contradiction  —  i.e.  the  model  space  is  complete  and 
consistent.  A  basis  for  this  assumption  is  the  residue 
theorem,  mentioned  below.  If  there  are  inconsistencies  in  the 
domain  definition  then  it  is  impossible  to  predict  what  might 
happen.  The  collection  of  THASSERTs  is  used  to  find 
valuations  for  the  eigen  terms  in  the _  TP-state,  that  will 
produce  a  contradiction  with  the  model  space  in  every  sequent 
In  the  TP-state,  if  such  a  contradiction  is  possible. 
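
The search for a strong contradiction described above can be
illustrated by a small Python sketch. It is not part of MDS: the
function names, the tuple encoding (x, r, y) of a predicate and the
dictionary of ONEOF ranges are assumptions made for this example
only, and the brute-force enumeration is the simplest possible
strategy rather than the one MDS would use.

```python
from itertools import product

def substitute(pred, binding):
    """Replace eigen terms in a predicate (x, r, y) by their chosen values."""
    x, r, y = pred
    return (binding.get(x, x), r, binding.get(y, y))

def find_strong_contradictions(positives, negatives, ranges):
    """Yield each binding under which some asserted predicate is also denied.

    positives: predicates (x, r, y) asserted true
    negatives: predicates (u, r, v) asserted false, i.e. ~(u r v)
    ranges:    eigen term -> list of its potential values (its ONEOF range)
    """
    terms = sorted(ranges)
    for values in product(*(ranges[t] for t in terms)):
        binding = dict(zip(terms, values))
        asserted = {substitute(p, binding) for p in positives}
        denied = {substitute(q, binding) for q in negatives}
        if asserted & denied:
            yield binding

# Hypothetical THASSERT: (p startingnode z) ~(p startingnode e),
# where z is an eigen term with z = (ONEOF (n e)).
hits = list(find_strong_contradictions(
    positives=[("p", "startingnode", "z")],
    negatives=[("p", "startingnode", "e")],
    ranges={"z": ["n", "e"]},
))
print(hits)
```

A weak contradiction would additionally require asking the model
space whether it accepts each substituted predicate, which this
sketch does not attempt.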

For each eigen term its possible range of values will be
separated into two parts: those that do not produce a
contradiction in any of the THASSERTs, and those that produce
a contradiction in at least one THASSERT. If no contradiction
is encountered for any of the possible assignments of values
for the eigen terms in a THASSERT, then the appropriate
UNKNOWN residue (see [Srinivasan 1976b]) is returned to the
TP-state, if one exists. The unknown residue is substituted
for the predicate associated with it, in the appropriate
sequents, as explained in Table III.

For a predicate, (P), with or without quantifiers, the
unknown residue exists only if (P) has truth value ?
(UNKNOWN) for the given valuations of the terms in the
predicate. The residue then consists of the parts of the
predicate (P) that are UNKNOWN in the model space. One may
view these as the conditions under which a given assertion to
the model space can become true. In building models using the
residues we make use of the RESIDUE THEOREM discussed in
[Srinivasan 1976b]: Let V be the valuation of the terms in
(P), and let R[V](P) be the residue of (P) for the valuation
V. Let PV be the predicate P with the known valuations in V
kept fixed. Then the residue theorem says that
R[V](P) <-> (PV). That is, any model that is valid for R[V](P)
is valid also for (P). Thus, any model that contradicts
R[V](P) would also contradict (P).



TABLE  III:  CONTROL  STRUCTURE  FOR  THE  THEOREM  PROVER. 

Steps  1  through  4  constitute  a  STAGE  in  the  TP-Process.  Each 
stage  begins  with  a  TP-state  and  ends  with  (possibly)  a  new 
TP-state. 

First apply ALL applicable propositional rules, equality rules
and quantifier rules. For each new eigen term, z, keep note
of the possible range of values that the eigen term can have.
The ONEOF function is used for this purpose. For each
application of an Instantiation rule do (CREATE T1 v), where
T1 is the type of v, if the model space permits the creation
of new instances of T1. The symbol v is a new symbol not used
previously in the TP-process. If the creation of new
instances is prohibited in the model space, then create a new
eigen variable, say v, and set

v = (ONEOF <the instances of T1 in the model space.
            These instances are not to include any of
            the previously generated eigen variables
            in the TP-process.>).

Please note that v is still treated here as an eigen variable,
and NOT as an eigen term. The ONEOF function is used only for
existential instantiations. If v is the result of a
universal instantiation then mark v by the label ALL, else
mark it by the label SOME.

For each sequent in the TP-state do the following: Let p1,
p2, ..., pn be all the predicates of the form (x r y) on the
left hand side of a sequent in the TP-state, and let q1, q2,
..., qm be the similar predicates on the right hand side of
the same sequent. Do

A: [THASSERT p1 p2 ... pn ~q1 ~q2 ... ~qm].

This might cause these elementary predicates to be moved into
the model space. Let A1, A2, ..., Ak be all the assertions
made to the model space in this manner. Let P1, P2, ..., Pk
be the strings of all the predicates in A1, A2, ..., Ak,
respectively. Let SEQ1, SEQ2, ..., SEQk be respectively the
sequents with which the assertions are associated. A
THASSERT, Ai, is used to update the model space only if it
does not contradict any of the HYPOTHESES (see (c) below)
generated by the TP-process. If a THASSERT does not cause a
contradiction in the model space, but contradicts a
hypothesis, then we shall not update the model space. The
associated sequent will be left in the TP-state unchanged.

One of the following three can now happen for each such
THASSERT, Ai:

(a) All the predicates in Pi are accepted by the model
space. Some of them are unconditionally accepted by the model
space. In this case delete the accepted elementary predicates
in Pi from the sequent SEQi. If as a result of this SEQi
becomes NULL (both its left and right sides become empty),
then undo the THASSERT, Ai. The predicates in Pi are now
candidates for a new HYPOTHESIS. Some of the predicates are
accepted by the model space conditionally. An UNKNOWN residue
(see [Srinivasan 1976b] for the definition of residues) is
returned for each such conditionally accepted predicate. In
this case, substitute each such predicate in SEQi by its
associated residue.

(b)  One  or  more  of  the  predicates  is  NOT  accepted  by  the 
model  space.  In  this  case  simply  delete  the  sequent  from  the 
TP-state  and  make  note  that  a  contradiction  has  occurred. 
Keep  note  of  the  assignments  of  values  to  the  eigen  terms  that 
caused  the  contradiction. 

(c) After doing all the THASSERTs and the associated
updates to the sequents, see whether there are any NULL
sequents in the TP-state. If SEQi, SEQj, ..., SEQt are all NULL
then delete them from the TP-state, and generate the following
hypothesis:

HYP: ~(Pi V Pj V ... V Pt),

where Pi, Pj, ..., Pt are respectively the predicates
appearing in the assertions Ai, Aj, ..., At, associated with
the sequents SEQi, SEQj, ..., SEQt.

If the TP-state is empty, and if a contradiction has occurred,
then the theorem is proven. The solution may now be extracted
from the model in the model space. The solution is found by
assigning the proper values for the eigen terms. We shall not
discuss here the solution extraction process.

If no contradiction has occurred then GO AND CONSULT USER
about what to do next. We shall discuss some of the
strategies appropriate for this situation in an ensuing paper
[Srinivasan 1976c].

If the TP-state is not empty then go to step 1.
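
The handling of the THASSERTs in one stage of Table III can be
sketched in Python as follows. This is only an illustration under
strong simplifying assumptions: ask_model is a hypothetical stand-in
for THASSERT that answers each predicate independently, the check
against earlier hypotheses and the undoing of a THASSERT are left
out, and eigen terms are not treated at all. The names Sequent and
run_stage are inventions of the sketch.

```python
from dataclasses import dataclass

@dataclass
class Sequent:
    left: list                      # predicates on the left of =>
    right: list                     # predicates on the right of =>

    def is_null(self):
        return not self.left and not self.right

def run_stage(sequents, ask_model):
    """Process every sequent of a TP-state once.

    ask_model(pred, positive) returns True (accepted), False (not
    accepted), or a residue predicate (conditionally accepted).
    """
    remaining, null_literals, contradiction = [], [], False
    for seq in sequents:
        literals = [(p, True) for p in seq.left] + [(q, False) for q in seq.right]
        answers = [(p, pos, ask_model(p, pos)) for p, pos in literals]
        if any(ans is False for _, _, ans in answers):
            contradiction = True    # case (b): delete the sequent
            continue
        new = Sequent([], [])
        for _, positive, ans in answers:
            if ans is True:
                continue            # case (a): accepted predicates are deleted
            (new.left if positive else new.right).append(ans)  # keep the residue
        if new.is_null():
            null_literals.append(literals)      # case (c): hypothesis candidates
        else:
            remaining.append(new)
    hypothesis = ("NOT", ("OR", *null_literals)) if null_literals else None
    return remaining, contradiction, hypothesis

def toy_model(pred, positive):
    """A made-up model space used only to exercise run_stage."""
    if pred == ("p", "tail", "NIL") and not positive:
        return False                            # denies a known fact
    if pred[1] == "endingnode":
        return ("residue-of",) + pred           # conditionally accepted
    return True

state = [
    Sequent([("p", "startingnode", "n")], [("p", "endingnode", "e")]),
    Sequent([("p", "startingnode", "e")], [("p", "tail", "NIL")]),
]
remaining, contradiction, hypothesis = run_stage(state, toy_model)
print(remaining, contradiction, hypothesis)
```

The stage would be repeated on the remaining sequents until the
TP-state is empty, as in the last step of the table.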



When a predicate is asserted into the model space it
initiates a rather complex process. First the constraints
associated with the predicate and the constraints in all its
dependent predicates are evaluated. If there are also
TRANSFORMATION RULES (see [Srinivasan 1976b, 1975b]) associated
with any of the relations asserted into the model space they
will also get evaluated and all the appropriate "side effects"
will be taken care of. In the case of an assertion (x r y),
where both x and y are eigen variables (i.e. x and y are
constants already in the model space), this assertion may
actually cause the models in the model space to change. The
appropriate residues will then be returned to the TP-state.
In the case of eigen terms, we shall associate with each eigen
term its range of possible values: those leading to a
contradiction as well as those that do not lead to a
contradiction at a given stage of the TP-process. For an
eigen term, y, of type, say, Y, we will maintain as possible
values only the instances of Y. We shall write this as
y = ONEOF[(y1 y2 ... yn; z1 z2 ... zm)], where the z's cause
contradiction. We shall use the notation (x r) to denote the
collection of all y such that (x r y) is true in the model
space, and at times write y = (ONEOF (x r)) to indicate that the
possible values of y are precisely those that satisfy (x r).
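
The two-part range y = ONEOF[(y1 ... yn; z1 ... zm)] can be
pictured as a small record. The Python sketch below is an
assumption of this note, not the MDS representation: the class name
EigenTerm and its methods are hypothetical, and restricting a range
to (x r) is modelled simply by moving every other value to the
contradicting part.

```python
from dataclasses import dataclass, field

@dataclass
class EigenTerm:
    name: str
    viable: list                                    # y1 ... yn: no contradiction so far
    rejected: list = field(default_factory=list)    # z1 ... zm: cause contradiction

    def reject(self, value):
        """Move a value to the contradicting part of the range."""
        if value in self.viable:
            self.viable.remove(value)
            self.rejected.append(value)

    def restrict(self, allowed):
        """Keep only the viable values that also occur in `allowed`, e.g. (x r)."""
        for value in [v for v in self.viable if v not in allowed]:
            self.reject(value)

    def __str__(self):
        return f"{self.name} = ONEOF[({' '.join(self.viable)}; {' '.join(self.rejected)})]"

y = EigenTerm("y", ["a", "b", "c"])
y.restrict({"a", "c"})        # suppose (x r) is the collection {a, c}
print(y)
```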

The assertions into the model space perform four
functions: (a) recognize contradictions, if any, (b) delimit
the scopes of the relevant eigen terms if possible,
(c) generate HYPOTHESES to guide the TP in its operations (we
shall discuss this in greater detail in section 5), and
(d) return to the TP-state additional constraints relevant to
the proof of the assertion in the form of UNKNOWN residues,
that is, logical expressions whose truth values could not yet
be determined in the model space because the needed
information is not yet available. With these preliminary
comments we may now consider the solution of the maze problem.

4.  THE  MAZE  PROBLEM. 

To understand the definitions given in Table IV please
see [Srinivasan 1976a]. To follow the discussion below it is
essential that the reader understand Table IV and the concept
of residues [Srinivasan 1976b]. Let us assume that an
instance of MAZE has already been created in the model space
and the connectivity of all its nodes has been specified; no
new nodes may be created in the model space. For some
particular node, n, of this maze we wish to prove the
assertion [a1] of Table V. We shall comment on the proof
shown in Table V in the next section. This table is mostly
self explanatory.




TABLE IV: THE DEFINITION OF THE MAZE DOMAIN.

[(MAZE $N)   (startingnodes (NODES $L) startingnodesof CC1)
             (exit (NODE $N) exitof CC2)
             (contains (NODES $L) belongto CC3)].

   CC1:      [MAZE startingnodes].
             [(NODE X) | (@ startingnode X) (X connectedfrom NIL)].

   CC2:      [MAZE exit].
             [(NODE X) | (@ exit X) (X connectedto NIL)].

   CC3:      [MAZE contains].
             [(NODE X) | (@ startingnode X) V (@ exit X) V
                         ((SOME NODE Y) (@ contains Y)
                                        (Y canreach X))].

[(NODES $L)  (elemdn (0 * NODE))
             ((connectedfrom V) (NODE $N) connectedto)
             ((belongto V) (MAZE $N) contains)].

[(NODE $N)   (connectedto (NODES $L) connectedfrom CC4)
             ((canreach $X) (NODES $L) canbereachedfrom CC5)].

   CC4:      [NODE connectedto].
             [(NODE X) | (@ connectedto X) (NOT (X is @))].

   CC5:      [NODE canreach].
             [(NODE X) | (@ connectedto X)].

[(PATH $N)   (startingnode (NODE $N) startingnodeof)
             (tail (PATH $N) tailof CC6)
             ((endingnode $) (NODE $N) endingnodeof CC7)].

   CC6:      [PATH tail].
             [(PATH P) | (@ tail P)
                         ((P is NIL) V
                          ((SOME NODE N)
                           (P startingnode N)
                           (@ startingnode:is:connectedto N)))].

   CC7:      [PATH endingnode].
             [(NODE X) | ((@ tail NIL) <-> (@ startingnode X))
                         ((ALL PATH P)
                          (@ tail P) -> (P endingnode X))].
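
For readers who find the schema notation unfamiliar, the PATH part
of Table IV can be paraphrased in Python. The sketch is an
interpretation made for illustration, not MDS code: Path,
satisfies_cc6 and endingnode are hypothetical names, None stands
for NIL, the maze is given as a dictionary from each node to the
set of nodes it is connected to, and the further requirement
implied by the biconditional in CC7 (a path with a non-NIL tail
does not end at its own starting node) is ignored.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Path:
    startingnode: str
    tail: Optional["Path"] = None    # None plays the role of NIL

def satisfies_cc6(path, connectedto):
    """CC6, checked for a path and for each of its tails: the tail is NIL,
    or the tail starts at a node our starting node is connected to."""
    if path.tail is None:
        return True
    return (path.tail.startingnode in connectedto.get(path.startingnode, set())
            and satisfies_cc6(path.tail, connectedto))

def endingnode(path):
    """CC7 read recursively: a path with a NIL tail ends at its starting
    node; otherwise it ends wherever its tail ends."""
    return path.startingnode if path.tail is None else endingnode(path.tail)

connectedto = {"n": {"a"}, "a": {"e"}, "e": set()}
p = Path("n", Path("a", Path("e")))
print(satisfies_cc6(p, connectedto), endingnode(p))
```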




TABLE  V:  STATEMENT  AND  SOLUTION  OF  THE  MAZE  PROBLEM. 


The  nodes  n  and  e  below  are  assumed  to  be  constants,  already 
in  the  model  space. 


[a1].                          => ((SOME PATH P)
                                   (P startingnode n) ->
                                   (P endingnode e))

Apply (=> SOME) rule. No prior instance of PATH exists.
Hence, do (CREATE PATH p) and do [P:p] for the right hand side
of the sequent.

[a2].                          => ((p startingnode n) ->
                                   (p endingnode e));

By (=> ->)

[a3]. (p startingnode n)       => (p endingnode e).

[THASSERT (p startingnode n) ~(p endingnode e)].

[a4].                          => [Residue of (p endingnode e)].
which is
                               => ((p tail NIL) <->
                                   (p startingnode e))
                                  & ((ALL PATH P)
                                     (p tail P) -> (P endingnode e)).

By (=> &) and (=> <->),

[a5]. (p tail NIL)             => (p startingnode e);

      (p startingnode e)       => (p tail NIL);

                               => ((ALL PATH P)
                                   (p tail P) -> (P endingnode e));

By (=> ALL), (CREATE PATH p1), and do [P:p1] for the last
sequent above, and do (=> ->) to get

      (p tail p1)              => (p1 endingnode e).

[THASSERT (p tail NIL) ~(p startingnode e)],
[THASSERT ~(p tail NIL) (p startingnode e)],
[THASSERT (p tail p1) ~(p1 endingnode e)].


If n = e, then the earlier assertion (p startingnode n) will
contradict ~(p startingnode e). In the second THASSERT,
~(p tail NIL) will contradict (p tail NIL) of the first
THASSERT. Similarly, (p tail p1) of the third THASSERT also
will contradict (p tail NIL). This will terminate the proof.
If n is not equal to e, then the first THASSERT will generate
the following hypothesis (since the sequent becomes NULL as a
result of the THASSERTs):

HYP1: ~((p tail NIL) ~(p startingnode e))

Notice that the above hypothesis is a special case of the
condition CC6, associated with [PATH tail]. The TP-process has
now discovered a property of the domain as a result of its
interactions with the model space. It should be mentioned
that one should be extremely lucky to discover properties in
this manner! The ability to discover properties like this is
dependent on the kinds of domain descriptions that have been
given to the system. The second THASSERT will produce a
contradiction in the model space, since (p startingnode n) is
already true in the model space. The last THASSERT will
produce,

[a8]. [Residue of (p tail p1)] =>
                    [Residue of (p1 endingnode e)].

which is

      ((SOME NODE N)
       (p1 startingnode N) (n connectedto N))
                 => (((p1 tail NIL) <->
                      (p1 startingnode e)) &
                     ((ALL PATH P)
                      (p1 tail P) -> (P endingnode e))).

Notice that the phrase "(p startingnode:is:connectedto N)"
occurs in CC6, from which the residue on the left hand side
above was obtained. However, (p startingnode) is already
known to be n in the model space as a result of the assertion
in [a3]. Thus, (p startingnode) gets substituted by n in the
residue. We now do (SOME =>), (=> &), (=> ALL) and (=> ->).
Since the creation of new nodes has been blocked, it is now
not possible to CREATE a new node for the (SOME =>) rule
application. Therefore, an eigen variable, v1, is created and
set equal to,

      v1 = (ONEOF <the nodes in the model space>).

A value for this v1 has to be found from the model space.
Notice, however, that v1 does not have the status of an eigen
term in the TP-process. That is, in the THASSERT process v1
cannot be set equal to another eigen variable. Now, (CREATE
PATH p2) and do [P:p2]. Then doing all the propositional
rules, we get:

[a9]. (p1 startingnode v1)(n connectedto v1)(p1 tail NIL)
                         => (p1 startingnode e);

      (p1 startingnode v1)(n connectedto v1)
      (p1 startingnode e)
                         => (p1 tail NIL);

      (p1 startingnode v1)(n connectedto v1)(p1 tail p2)
                         => (p2 endingnode e).

This leads to,

[THASSERT (p1 startingnode v1)(n connectedto v1)(p1 tail NIL)
          ~(p1 startingnode e)],

[THASSERT (p1 startingnode v1)(n connectedto v1)
          (p1 startingnode e) ~(p1 tail NIL)],

[THASSERT (p1 startingnode v1)(n connectedto v1)(p1 tail p2)
          ~(p2 endingnode e)].

The assertion of (n connectedto v1) would limit the scope of
possible values of v1 to those nodes that are known to be
connected to n in the model space. If the range of v1 contains
e, then v1 can be assigned the value e, and this would produce
a contradiction in the first THASSERT. Maintaining the same
assignment for v1 would also produce contradictions in the
second and third THASSERTs above. In the case of the second
THASSERT, ~(p1 tail NIL) will not be accepted, since as a
result of the first assertion we now would have (p1 tail NIL)
in the model space. Similarly, in the case of the third
THASSERT, (p1 tail p2) will not be accepted. This will thus
terminate the proof. The PATH p in the model space is the
solution.

If the range of v1 does not contain e, then the first THASSERT
will cause a NULL sequent. Its predicates will become
candidates for a new hypothesis. The second THASSERT will
cause a contradiction, since (p1 startingnode e) cannot be
accepted. It would violate CC6. In the third THASSERT we will
have,

[a10]. [Residue of (p1 tail p2)]
                 => [Residue of (p2 endingnode e)].

which would be exactly the same as [a8] with the
substitutions of p2 for p1 and p1 for p, and the process [a8]
through [a10] will be repeated. This will iterate until a
path is reached whose starting node could be e. In that case
we would have in the model space the following:

   (p startingnode n)(p1 startingnode v1)(p2 startingnode v2)
   ... (pi startingnode vi)(p(i+1) startingnode e),

for some integer i. Also we would have,

   v1 = (ONEOF (n connectedto); ...),
   v2 = (ONEOF (v1 connectedto); ...),
   ...
   vi = (ONEOF (v(i-1) connectedto); ...).

The paths that are solutions to the problem may now be
extracted from this.
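
The chain of ranges v1, v2, ..., vi built up by the repeated
[a8] through [a10] steps can be imitated outside the theorem
prover. The Python sketch below is an approximation made for
illustration: it takes each range to be the set of all nodes
connected to any value in the previous range, whereas in the proof
the range of vi depends on the value chosen for v(i-1). The
function name eigen_ranges and the max_depth bound are assumptions;
the bound is there because the iteration need not stop on its own
when the maze contains a loop.

```python
def eigen_ranges(connectedto, n, e, max_depth):
    """Return the successive ranges [v1, v2, ...] up to the first one that
    contains e, or None if e is not reached within max_depth steps."""
    ranges = []
    frontier = {n}
    for _ in range(max_depth):
        # (v connectedto) for every candidate value v of the previous range
        frontier = set().union(*(connectedto.get(v, set()) for v in frontier))
        if not frontier:
            return None          # dead end: no node left to continue from
        ranges.append(frontier)
        if e in frontier:
            return ranges
    return None

maze = {"n": {"a", "b"}, "a": {"c"}, "b": set(), "c": {"e"}}
print(eigen_ranges(maze, "n", "e", max_depth=10))
```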





5. DISCUSSION OF THE PROOF.

In this proof we have assumed the simplest situation. We
already knew the full description of the maze. If the
description of the maze were not available then the above
proof might not have terminated. The theorem prover does not
have the ability to recognize that some information is missing
and could be acquired by consulting with an outside agent.
The process will also not terminate in cases where there is no
path, and there is a loop in which the search is trapped. We
could have defined the domain differently to take care of the
loop problem: one would introduce a new relation in PATH which
identifies all the nodes in the path prior to a given PATH
location, and insist that the startingnode of a tail should
never be one of the nodes already appearing in the prefix of a
path. There are facilities in the MDS model space also to
indicate situations in the description of an object where a
missing piece of information has to be obtained from an
outside agent. In the case of a maze one might, for example,
specify that the connecting nodes of a given node have to be
obtained from an agent. So also, one might specify that the
starting node of a PATH should be obtained from outside
consultation. In general, for every relation in the various
schemas shown in Table IV, if the relation does not have a
constraint associated with it, then one might say that to
acquire that piece of information one has to consult with an
outside agent. This can be done in MDS by associating flags
with the relations [see Irwin, J. & Srinivasan 1975].
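
The remedy suggested above for the loop problem, namely that the
starting node of a tail must never already appear in the prefix of
the path, is the familiar visited-prefix check of a depth-first
search. The Python sketch below shows that check in isolation; it
is not a description of how the constraint would be written in MDS,
and the function name find_path and the dictionary encoding of the
maze are assumptions.

```python
def find_path(connectedto, n, e, prefix=()):
    """Depth-first search from n to e that never revisits a node already
    in the prefix of the path being built. Returns a list of nodes or None."""
    prefix = prefix + (n,)
    if n == e:
        return list(prefix)
    for nxt in sorted(connectedto.get(n, ())):
        if nxt in prefix:
            continue             # would re-enter a node already on the path
        found = find_path(connectedto, nxt, e, prefix)
        if found:
            return found
    return None

looped = {"n": {"a"}, "a": {"n", "b"}, "b": {"e"}}
print(find_path(looped, "n", "e"))
```

With the prefix check the search also terminates when no path
exists, for example on a maze consisting only of the loop between
n and a.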


The organization described here does not provide a good
search strategy to look for the right kinds of bindings for
the eigen terms. In general the problem of estimating the
possible bindings and choosing the right ones is an unsolvable
problem. Usually one just carries along all the available
values and does not make any commitment until a contradiction
is identified. In the procedure described above we do make
commitments on the choice of values based on what the model
space accepts. At times this leads to NULL sequents in the
TP-state. In the example discussed above we had a lucky set
of circumstances: we encountered the hypothesis before we made
any commitments on the choice of values for any of the eigen
terms. In general, when a sequent becomes NULL one is in a
situation where one might have to do some backtracking, and
undo some of the changes previously made in the model space.
An important property of Gentzen's system is that it can run
in either direction; there is no information loss. This is not
true for the RESOLUTION approach. When a sequent becomes NULL
one should backtrack to the previous TP-state and undo any
changes that might have been made in the model space by the
THASSERT associated with the sequent corresponding to the
NULL sequent. This might cause one to reevaluate the previous
TP-state all over again, and change many of the bindings
chosen for the eigen terms. We shall not discuss here the
organization of the model space and the TP-state that make
such backtracking feasible. Again, if the domain
descriptions are right, hopefully, such backtracking would be
avoided.
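
The paper does not describe how the model space makes the undoing
of a THASSERT feasible. One common technique, shown here only as an
assumption and not as the MDS organization, is to record every
change on a trail and to roll the trail back to a mark taken before
the THASSERT. The class ModelSpace and its methods in the Python
sketch below are hypothetical.

```python
class ModelSpace:
    """A toy model space: a set of facts plus a trail of additions."""

    def __init__(self):
        self.facts = set()
        self.trail = []              # facts in the order they were added

    def mark(self):
        """Remember the current position so later changes can be undone."""
        return len(self.trail)

    def assert_fact(self, fact):
        if fact not in self.facts:
            self.facts.add(fact)
            self.trail.append(fact)

    def undo_to(self, mark):
        """Remove every fact added since the given mark."""
        while len(self.trail) > mark:
            self.facts.discard(self.trail.pop())

space = ModelSpace()
space.assert_fact(("p", "startingnode", "n"))
before = space.mark()                      # taken just before a THASSERT
space.assert_fact(("p1", "tail", "NIL"))
space.undo_to(before)                      # the sequent became NULL: back up
print(space.facts)
```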


If the TP-state becomes null and no contradiction has
been obtained, and possibly also some hypotheses are
available, then in certain cases it might still be possible to
continue with the proof procedure if the right kinds of new
instances of the necessary objects are created in the model
space. This might be done by consulting with a user. Also,
in these situations, at times, the system can use the so
called "model completion" criteria to continue with the proof.
The concept of model completion is discussed in detail in
[Srinivasan 1976c]. The objective in model completion is to
create new objects in the model space necessary to complete
all the missing (unknown) information in the model.

The  thrust  of  this  work  has  been  so  far  not  in  the 
identification  of  good  search  strategies,  but  in  creating  a  TP 
control  structure  that  can  interact  with  a  model  space  which 
has  itself  been  defined  in  a  context  totally  independent  of 
the  TP  process.  Also,  it  seems  the  TP  can  explain  its 
operations  to  a  user  in  a  natural  way. 

The procedure presented here is bound to be incomplete,
because of the weak requirement for contradiction: the notion
of contradiction is dependent on the model space. However, if
the domain description is right then we can prove interesting
theorems. We do not use theorem proving as the principal mode
of problem solving in MDS. Our objective has been to fulfill
an aesthetic theory: if one could describe knowledge to a
computer and have the computer understand the knowledge then
one should be able to use the knowledge not only in highly
specialized ways but also in the context of a general
deductive system. The TP is expected to be used in MDS to
guide construction of instances of schemas, and to resolve
unanticipated conflicts in the updating of the model space.
Our ideas on the use of TP in these areas are at the moment
still in the developing phase. The central problem is one of
using the TP in reasonably efficient ways to help in the
updating process.

We expect to have the MDS model space management system
working in a few months. The TP is not yet implemented. We
are still at the stage of exploration.

ACKNOWLEDGEMENTS:

During the early stages of development of the ideas
presented here, in the spring of 75, when we were still
learning about Gentzen's system and Beth's Semantic Tableaux, I
had the benefit of many discussions with Jared Darlington.
These discussions were extremely valuable in crystallizing
many of my ideas. The discussions I had with my student, Tau
Hsu, were very useful for identifying the deficiencies in the
TP process and correcting some errors.

5.  REFERENCES: 



Beth, E.W. [1959]: The Foundations of Mathematics, North Holland
        Publishing Company, Amsterdam.

Kanger, Stig [1963]: A simplified proof method for elementary
        logic. In Computer Programming and Formal Systems,
        Braffort & Hirschberg (Eds), North Holland Publishing
        Company, Amsterdam.

Srinivasan, C.V. [1973]: "The Architecture of Coherent
        Information Systems: A General Problem Solving
        System", in 3IJCAI, Stanford 1973. A revised
        version of this paper appears in IEEE Transactions
        on Computers, Special issue on Artificial
        Intelligence, April 1976.

        [1975a]: A formalism to define the structure of
        knowledge, RUCBM-TR-51, Department of Computer
        Science, Rutgers University, New Brunswick, N.J.

        [1975b]: The meta Description System, RUCBM-TR-50,
        Department of Computer Science.

        [1976a]: "Introduction to the meta description
        system", Department of Computer Science Technical
        Report, SOSAP-TR-18, Rutgers University.

        [1976b]: "The model space of the meta description
        system", Department of Computer Science Technical
        Report, SOSAP-TR-19, Rutgers University.

        [1976c]: "The Proof of Cantor's Theorem in MDS", in
        preparation.

Irwin, J. & ... [1975]: The description of CASNET in MDS,
        RUCBM-TR-49, Department of Computer Science.


RUCBM-TR-49 


August  1976 


DESCRIPTION  OF  CASNET  IN  MDS 
J.  Irwin  and  C.  V.  Srinivasan 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  partially  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6 
to  the  Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


DESCRIPTION OF CASNET IN MDS..JUNE 1975


Table of Contents

1.  Introduction.
2.  Basic concepts: CASNET and the METHODOLOGY of
    description in MDS.
3.  The DESCRIPTION STRUCTURE of CASNET.
4.  Templates and their instantiations.
    4.1: Types of templates and examples of their use
         in CASNET.
    4.2: The relation flags and their use.
    4.3: The CONSISTENCY CONDITIONS.
         4.3.1: The form of CC's.
         4.3.2: The interpretation of CC's.
5.  The MODEL DEFINITION and MODEL INSTANTIATION processes.
    5.1: The Definition of CAUSALMODEL.
    5.2: MODEL INSTANTIATION process.
         5.2.1: The Record Keeping Process.
         5.2.2: The test selection and application.
6.  Concluding Remarks.
7.  Acknowledgements.


REFERENCES 


APPENDIX  I:  CASNET  DEFINITION 




1.  Introduction. 

This report presents a formal description of the CASNET
system of Kulikowski and Weiss [Weiss, 1972], in the formalism
of the Meta Description System (MDS) [Srinivasan ...]. The
report is addressed to readers who have some knowledge of
the CASNET system. The purpose of the report is to illustrate
the use of the MDS formalism, its power and flexibility. The
description of a system in the MDS formalism not only serves
as a precise documentation of what a system is, but also as a
program that implements the described system in the context
of the processors in MDS. Thus, MDS can not only execute the
system described to it, but also answer questions about the
described system. The report is written in such a way that
readers familiar with CASNET will get a fairly good grasp of
the facilities, power and advantages of using the Meta
Description System.

2.0: Basic Concepts: CASNET and the METHODOLOGY of
description in MDS.

CASNET is a system for modeling disease processes in
terms of CAUSAL NETs. The nodes of a causal net would
represent the so called "disease states", and directed links
between pairs of states (nodes) would represent the
"causality" relation between the states. Thus, (s1 -> s2)
would indicate that the occurrence of the disease state s1
at one time, might cause a later occurrence of the disease
state s2. Both nodes and links in a causal net are weighted
objects. The weight of a node is used to determine its
state of confirmation, and the weight of a link would
represent the strength of causality. These weights are in
general vectors. For a given (s1 -> s2) the weights of s1,
s2 and the link would be required to satisfy prescribed
rules of consistency. Thus, for example, if s1 has the
status CONFIRMED, and the strength of the causal link is say
1, then one might require that the status of s2 should also
be CONFIRMED.
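
The example rule above can be illustrated with a small sketch in
Python. This is not part of MDS or CASNET: the class names State
and Link, and the use of a single scalar in place of each weight
vector, are simplifying assumptions made only for illustration.

```python
from dataclasses import dataclass

CONFIRMED, DENIED, UNKNOWN = "CONFIRMED", "DENIED", "?"

@dataclass
class State:
    name: str
    status: str = UNKNOWN   # scalar stand-in for the node's weight vector

@dataclass
class Link:
    cause: State
    effect: State
    strength: float         # scalar stand-in for the link's weight vector

def consistent(link: Link) -> bool:
    """Example rule from the text: if the cause is CONFIRMED and the
    strength of the link is 1, the effect must also be CONFIRMED."""
    if link.cause.status == CONFIRMED and link.strength == 1:
        return link.effect.status == CONFIRMED
    return True   # the example rule says nothing about other cases

s1 = State("s1", CONFIRMED)
s2 = State("s2", DENIED)
link = Link(s1, s2, 1.0)    # (s1 -> s2) with strength 1
```

Here consistent(link) is false until the status of s2 is also set
to CONFIRMED, which is the situation the rule is meant to exclude.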

We shall refer to the disease model used by the CASNET
system as the CAUSALMODEL. To model a disease in CASNET one
has to first define the CAUSALMODEL for the disease. The
CAUSALMODEL will necessarily include a description of the
causal net to be used for the disease. We shall refer to
this as the CAUSALNETDEFN. In the MDS formalism we shall
say that both the relational forms

(CAUSALMODEL causalnetdefn CAUSALNETDEFN), and

(CAUSALNETDEFN causalnetdefnof CAUSALMODEL),

are well defined for the domain of CASNET. For the
processors in MDS this would mean,

"For every CAUSALMODEL, the causalnetdefn of the
CAUSALMODEL is a CAUSALNETDEFN."

For a given disease its associated CAUSALNETDEFN will
specify the states of the disease, the causal links, their
weights and constraints. Thus, if a CAUSALMODEL had been
defined for GLAUCOMA, then the phrase (GLAUCOMA
causalnetdefn) will produce in MDS the definition of the
CAUSALNETDEFN of GLAUCOMA.

In the CASNET system one may also define
correspondences between the weights of the disease states
and the testable symptoms of a disease process. These
correspondences are then used to construct a causal net
description of a disease process from the set of observed
symptoms of an afflicted patient. In the MDS formalism we
shall indicate this feature by defining the relation forms:

(CAUSALMODEL testdesns TESTDESNS),

(TESTDESNS testdesnsof CAUSALMODEL), and

(TESTDESNS elements TESTDESN).

Here TESTDESNS has been defined to be a collection of
TESTDESNs, an arbitrary number of them. Each TESTDESN
itself will specify one test and the interpretations for its
results, namely, the correspondences between the test
results and the weights of the disease states. A
CAUSALMODEL may have a collection of such TESTDESNs
associated with it. Again, if the CAUSALMODEL for GLAUCOMA
were available to MDS, the phrase (GLAUCOMA testdesns) would
refer to the collection of all tests defined for the
GLAUCOMA causal model.

The collection of confirmed states in the causal net
description of the disease afflicting a given patient,
together with their associated causal chains, are used in
CASNET to determine the disease diagnosis and possibly also
the necessary therapy. The associations between the causal
net descriptions of a disease process and the disease
diagnosis and therapy are specified in CASNET by
(CAUSALMODEL classifications). In the MDS description of
CASNET we shall introduce the relational forms:

(CAUSALMODEL classifications CLASSDEFNS),

(CLASSDEFNS classificationsof CAUSALMODEL), and

(CLASSDEFNS elements CLASSDEFN).

Here again, a CAUSALMODEL may have a collection of
CLASSDEFNs associated with it. Each CLASSDEFN will
introduce a rule of classification associated with a given
disease domain.

The disease states, the causal links, their
associations with disease symptoms, and the correspondences
between an instance of a causal net and the diagnosis and
therapy of the disease represented by the causal net, would
all depend, of course, on the nature of the disease being
modelled. In the conception of the CASNET system there are
thus two phases of operation: The first is the domain
definition phase. In this phase the CAUSALMODEL for a
disease domain would be defined. The second phase is the
domain execution phase. In this phase, the CAUSALMODEL
definition would be used to generate a causal net
description of an instance of the disease in an afflicted
patient. In the implemented version this is, however, not
quite true. Besides defining the CAUSALMODEL for a disease
domain a user is required to also write a few domain
specific programs in order to be able to apply the defined
CAUSALMODEL to generate causal net descriptions of instances
of the disease. These domain specific programs may vary
widely from one disease domain to another. Thus, in the
case of the CASNET defined for GLAUCOMA the disease is
defined only for one eye. Special routines had to be
written to consider the disease affliction for both eyes
simultaneously. The version described here is close to (**)
the implemented version, described in Weiss [1974] (see
section 6 for a discussion of what has been left out).

** It should be pointed out that it would be relatively easy
to modify the MDS description of the CASNET, given here, to
automatically take care of disease manifestations in more
than one organ of an afflicted organism.

In the MDS description the domain definition process
would manifest itself in the following manner: One would
first create the causal model definition for a disease
domain by creating an instance of the object, called
CAUSALMODEL. If the disease domain is, for example,
GLAUCOMA, then one may name this instance of CAUSALMODEL, by
the name GLAUCOMA. The causal model, GLAUCOMA, would have
associated with it, its own specific instances of
CAUSALNETDEFN, TESTDESNS, and CLASSDEFNS. Let (GLAUCOMA
causalnetdefn) be called GCASNET. Similarly, let (GLAUCOMA
testdesns) be (gt1 gt2 ... gtN), collectively referred to
by GTESTS, and (GLAUCOMA classifications) be GCLASSES = (gc1
gc2 ... gcK). The domain definition for GLAUCOMA will thus
consist of GCASNET, GTESTS and GCLASSES. By applying the
GCASNET, GTESTS and GCLASSES to particular instances of the
GLAUCOMA disease in given patients, one may now create
descriptions of the disease process in the patients
concerned. We shall refer to these disease descriptions as
DISEASEDESNS. To take care of the generation of these
DISEASEDESNS we shall introduce the following additional
relations to the CAUSALMODEL,

(CAUSALMODEL diseasedesns DISEASEDESNS),

(DISEASEDESNS causalmodel CAUSALMODEL), and

(DISEASEDESNS elements DISEASEDESN).

Now, every CAUSALMODEL may have a collection of DISEASEDESNs
associated with it. The definition of DISEASEDESN will
specify how the (CAUSALMODEL causalnetdefn), (CAUSALMODEL
testdesns) and (CAUSALMODEL classifications) would be used
to create instances of DISEASEDESN.
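
The paired relational forms introduced so far, such as
(CAUSALMODEL causalnetdefn CAUSALNETDEFN) and its inverse
(CAUSALNETDEFN causalnetdefnof CAUSALMODEL), can be pictured with
the following Python sketch. It is only an illustration and makes
no claim about how MDS stores relations: the helper names Obj,
relate and phrase are hypothetical, while the relation names are
those given in the text.

```python
# Relation names and their inverses, as given in the text.
INVERSE = {
    "causalnetdefn": "causalnetdefnof",
    "testdesns": "testdesnsof",
    "classifications": "classificationsof",
    "diseasedesns": "causalmodel",
}
INVERSE.update({inv: rel for rel, inv in list(INVERSE.items())})

class Obj:
    """Hypothetical stand-in for an instance of a template."""
    def __init__(self, name, elements=()):
        self.name = name
        self.elements = list(elements)   # non-empty only for collections
        self.rel = {}                    # relation name -> related object

def relate(x, relation, y):
    """Assert (x relation y) together with its inverse form."""
    x.rel[relation] = y
    y.rel[INVERSE[relation]] = x

def phrase(x, relation):
    """Evaluate a phrase such as (GLAUCOMA causalnetdefn)."""
    return x.rel.get(relation)

glaucoma = Obj("GLAUCOMA")
gcasnet = Obj("GCASNET")
gtests = Obj("GTESTS", elements=[Obj("gt1"), Obj("gt2")])
relate(glaucoma, "causalnetdefn", gcasnet)
relate(glaucoma, "testdesns", gtests)
```

With these assertions in place, phrase(glaucoma, "causalnetdefn")
yields GCASNET and phrase(gcasnet, "causalnetdefnof") yields
GLAUCOMA, mirroring the two directions of the relational form.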

In this preamble we have already introduced the central
properties of the CASNET system and the forms their
descriptions will take in MDS. The objects like
CAUSALMODEL, CAUSALNETDEFN, TESTDESNS, etc., that were
introduced above, are called TEMPLATES. The CAUSALMODEL
template has four relations defined for it: causalnetdefn,
testdesns, classifications and diseasedesns. As with the
CAUSALMODEL the structure and components of a CAUSALNETDEFN
will be specified by the template for CAUSALNETDEFN.
Similar considerations hold for the other relations defined
for the CAUSALMODEL.


The formal definition of the CAUSALMODEL template will
appear in MDS as shown below (the prefix "TDN:", for Template
DefinitioN, in the description below is the MDS command
that is used to define templates):

[TDN: CAUSALMODEL
    (causalnetdefn (IT CAUSALNETDEFN) causalnetdefnof)
    (testdesns TESTDESNS testdesnsof)
    (classifications CLASSDEFNS classificationsof)
    (diseasedesns DISEASEDESNS causalmodel)].

[TDN: TESTDESNS (ELEM TESTDESN)].
[TDN: CLASSDEFNS (ELEM CLASSDEFN)].

Here "(IT CAUSALNETDEFN)" specifies that for each
CAUSALMODEL a new instance of CAUSALNETDEFN ought to be
created. "IT" is the Instantiate Template command of the
MDS system. The word "ELEM" is used to denote "elements".
In MDS, the templates with ELEM relation are used to define
collections of objects. We shall refer to a template with
ELEM relation as a LIST TEMPLATE. Thus, TESTDESNS and
CLASSDEFNS above are list templates. A template like
CAUSALMODEL (that does not contain the ELEM relation) is
called a NODE TEMPLATE. Instances of NODE templates are
individual objects, whereas the instances of LIST templates
are collections. We shall later see the use of flags with
templates and relations to classify templates and identify
variations in their interpretations. The templates given
above specify the DESCRIPTION STRUCTURE of the objects they
define.

In the description structure of CAUSALMODEL,
(CAUSALMODEL causalnetdefn) is said to be a DIMENSIONALLY
CONSISTENT relational form. The template called by
(CAUSALMODEL causalnetdefn) is CAUSALNETDEFN. In the case
of (CAUSALMODEL testdesns) the called template is TESTDESNS,
which is a collection of TESTDESN. In situations like this,
we shall say that both TESTDESNS and TESTDESN are the called
templates of the anchor. We shall refer to dimensionally
consistent relational forms like (CAUSALMODEL causalnetdefn)
as the ANCHORS of CASNET. The description of an instance of
CAUSALMODEL will be said to be complete only when all its
four anchors have been instantiated. Generally speaking,
the instantiation of an anchor might call for an instance of
the template called by the anchor. Thus, the instantiation
of (CAUSALMODEL causalnetdefn) might call for an instance of
CAUSALNETDEFN. It is, of course, possible that not just any
instance of CAUSALNETDEFN would be satisfactory. To take
care of situations like this, one may associate with the
anchor (CAUSALMODEL causalnetdefn) a CONSTRAINT. For an
instance of CAUSALMODEL, say X, an instance of
CAUSALNETDEFN, say Y, would be accepted as the
causalnetdefn of X, only if Y satisfied the constraint of the
anchor (CAUSALMODEL causalnetdefn). One may think of the
constraint associated with an anchor as defining the
semantics of the relation in the anchor. These constraints
are specified in MDS in first order logic. We shall refer
to these constraints variously as CONSISTENCY CONDITIONS
(CC's), or SENSE DEFINITIONS. We shall later discuss in
more detail the forms and interpretations of these anchored
constraints.
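
The notions of NODE and LIST templates, anchors, completeness and
anchored constraints can be sketched in Python as follows. This
is an illustrative model only, not the MDS implementation: the
classes Template and Instance, the function instantiate_anchor,
and in particular the sample constraint (that a causal net must
already have its statedesns) are assumptions made for this
sketch. In MDS the constraints are stated in first order logic.

```python
class Template:
    def __init__(self, name, relations=None, elem=None):
        self.name = name
        self.relations = relations or {}  # relation -> called template name
        self.elem = elem                  # element template; LIST templates only

    @property
    def kind(self):
        return "LIST" if self.elem is not None else "NODE"

class Instance:
    def __init__(self, template):
        self.template = template
        self.values = {}                  # instantiated anchors

    def is_complete(self):
        """Complete only when every anchor has been instantiated."""
        return all(r in self.values for r in self.template.relations)

def instantiate_anchor(x, relation, y, constraints):
    """Accept y as (x relation) only if y is an instance of the called
    template and satisfies the constraint anchored there, if any."""
    if y.template.name != x.template.relations[relation]:
        return False
    check = constraints.get((x.template.name, relation))
    if check is not None and not check(x, y):
        return False
    x.values[relation] = y
    return True

STATEDESNS = Template("STATEDESNS", elem="STATEDESN")
TESTDESNS = Template("TESTDESNS", elem="TESTDESN")
CAUSALNETDEFN = Template("CAUSALNETDEFN", {"statedesns": "STATEDESNS"})
CAUSALMODEL = Template("CAUSALMODEL", {
    "causalnetdefn": "CAUSALNETDEFN",
    "testdesns": "TESTDESNS",
    "classifications": "CLASSDEFNS",
    "diseasedesns": "DISEASEDESNS",
})

# Hypothetical constraint, invented for illustration only.
CONSTRAINTS = {
    ("CAUSALMODEL", "causalnetdefn"): lambda x, y: "statedesns" in y.values,
}
```

An instance Y of CAUSALNETDEFN is thus rejected as the
causalnetdefn of X until it satisfies the anchored constraint, and
X remains incomplete until all four of its anchors are filled.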

Thus, TEMPLATES and their associated anchored
CONSTRAINTs are used in MDS to describe the objects in a
domain. The description of the CASNET domain in MDS will
consist of a number of different templates (like
CAUSALMODEL, CAUSALNETDEFN, TESTDESNS, TESTDESN, CLASSDEFNS,
CLASSDEFN, STATEDESNS, CAUSEDESNS, STRATEGIES, LIKELIHOOD,
etc.), and constraints associated with their various
anchors. These templates and constraints are shown in
Appendix A. Descriptions of this kind become interesting if
the system could use the templates and constraints
automatically to create valid instances of the objects so
described. Thus, from the template for CAUSALMODEL and all
its component templates (and components thereof), we would
like to be able to create a complete instance of
CAUSALMODEL. In this instantiation process the system
should produce wherever possible the necessary instances of
needed objects automatically, and where appropriate it
should acquire the necessary descriptions of needed objects
from the user (the model builder). If the only information
available consisted of templates and constraints, then this
kind of instantiation process is in general very difficult
to do. No (efficient) general methods exist for constraint
satisfaction problems of this type. In MDS automatic
instantiations of templates and anchors are made possible by
providing additional information to the system in the form
of control procedures, called TRANSFORMATION RULES (TR's).
We shall later discuss the forms and interpretations of
TR's. At this point it suffices to mention that the forms
of both the CC's and TR's are such that the processors in
MDS can not only evaluate them, but also understand what
they do. Thus, if a CONSTRAINT at an anchor is not
satisfied by a given object, then MDS would know the reason
why. It may use this reason for the failure, to search for
a more suitable candidate. If a transformation rule is
invoked to construct a new object or complete an updating
process, then MDS can generate from the transformation rule a
description of what it expects to accomplish by the
application of the rule, and how the rule goes about its
process. These two features of MDS are the most significant
ones that enable it to act as a general problem solving
system that can do non trivial problem solving tasks, by
making use of the domain KNOWLEDGE that has been described
to it. We may now briefly summarize the concepts introduced
so far as follows:

The concept of TEMPLATES and the process of their
instantiation are central to the operation of MDS. Each
TEMPLATE in a domain is associated with a CLASS of objects
in the domain. Thus, in CASNET there is a class of objects,
called TESTDESNs. A TEMPLATE is expected to contain in it
all the specifications necessary to create instances of the
class of objects that it defines. Instances of TESTDESN
will be descriptions of tests that may be applied for a
given disease domain. Thus, in GTESTS (tests for GLAUCOMA)
= (gt1 gt2 ... gtN), each gt will be an instance of
TESTDESN, and will describe a particular TEST that might be
applied to diagnose GLAUCOMA.

The information provided by a template may have three
parts to it: The STRUCTURAL INFORMATION, specifying what
objects in the domain may relate to what other objects and
by what relation; SENSE INFORMATION (the CONSISTENCY
CONDITIONS) specifying constraints on anchors; and
TRANSFORMATION RULES specifying the procedures to be used to
create instances (or updates of instances) of templates and
anchors satisfying all the specified constraints. Thus, the
CAUSALMODEL template, together with all the other templates
that it calls (and others that the called templates
themselves (recursively) call), would contain the
information necessary to create instances of CAUSALMODELs.
In the descriptions above we have shown only the
specifications of the structural components of CAUSALMODEL.
An example of anchor definition in a template, with
indications of associated CC and TR, occurs in (STATEDESN
presence) (See Appendix A):

(presence (PRESENCE TA) presenceof CC13 TR2).

In this definition TA is a flag associated with the PRESENCE
template, that declares PRESENCE to be a "Terminal Atom"
template: Every instance of PRESENCE should be a terminal
atom. For our purposes we shall just create two instances
of PRESENCE, one called CONFIRMED and the other called
DENIED. Thus, one may say that for an instance of
STATEDESN, called say S, the value of (S presence) can be
either CONFIRMED or DENIED (*). Both CONFIRMED and DENIED
are, of course, atoms (in the LISP sense). The labels CC13
and TR2 respectively refer to the CONSTRAINT and TR anchored
at (STATEDESN presence). The CC and TR themselves are
defined separately, by using the QSCC: (Set CC; the prefix
Q indicates that this command is part of the knowledge
acquisition system of MDS, called QUEST), and QSTR: (Set
TR) commands, as shown in Appendix A. CC13 specifies that
the presence of a state should be CONFIRMED if its status
exceeds a certain threshold; it should be DENIED if the
status is less than the threshold. In the beginning the
presence of a state might have the value UNKNOWN (denoted by
a ?, in MDS). If the status of the state changes during the
course of an experiment, then its presence will be
automatically set in accordance with CC13. However, in
certain situations a contradiction might arise: Suppose the
presence of the state was to begin with CONFIRMED, and later
during the course of a diagnostic experiment the status of
the state changed as a result of new evidence, and went
below the threshold. This would then cause a
CONTRADICTION in MDS. The new situation contradicts the
existing situation. To resolve such a contradiction, the
TR2 associated with the anchor will be invoked. We shall
later discuss the operations performed by TR2. One may
notice that the CASNET description has only about six
transformation rules in all. Also, each TR-rule is small,
and is intended to do specific remedial tasks in well
constrained situations. The major repository of CASNET
knowledge resides in TEMPLATES and SENSE DEFINITIONs. The
general problem solving processors built into MDS enable MDS
to accept descriptions of this kind and use them for domain
execution, via the instantiation process. The facilities in
MDS are quite general; they are not necessarily restricted
to the kinds of tasks encountered in CASNET. At the moment
we are using MDS also to describe the BELIEVER system of
Schmidt and Sridharan [Sridharan 1975]. MDS is a META system
in the sense that it can accept descriptions of KNOWLEDGE in
any domain and specialize itself to act efficiently in the
domain. A fairly detailed exposition of the various
facilities in MDS is available in [Srinivasan 1975a]. A
formal description of the TEMPLATE definition system, and
the various types of TEMPLATES that one may use, are
discussed in [Srinivasan 1975b].

* This is not, however, strictly true. Because of a
convention in MDS (to be explained later), the CC13 applies
to an instance of an instance of STATEDESN. An instance of
STATEDESN would itself be a template. This is indicated by
the flag "MN" (Meta Node) associated with the STATEDESN template
(see Appendix A). This flag indicates that every instance
of STATEDESN should be itself a NODE TEMPLATE. To create a
model for a disease domain, D, one may create several
instances of STATEDESN. These would then be the disease
state templates for the domain, D. Instances of these
disease state templates will occur in the causal net
description of a particular occurrence of the disease, D, in
a patient. CC13 would then apply to the states occurring in
the causal net description of this particular patient's
disease.
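
The behaviour described for CC13 and the contradiction that
triggers TR2 can be sketched in Python as below. The sketch is
an assumption-laden illustration, not the Appendix A definition:
the names cc13 and StateInstance are hypothetical, the status is
taken to be a single number, the case where the status equals the
threshold is left unchanged because the text does not specify it,
and the handler passed in merely stands in for TR2, whose actual
operations are discussed later in the report.

```python
CONFIRMED, DENIED, UNKNOWN = "CONFIRMED", "DENIED", "?"

def cc13(status, threshold):
    """Presence required by the rule stated in the text."""
    if status > threshold:
        return CONFIRMED
    if status < threshold:
        return DENIED
    return None   # equality is not covered by the text

class StateInstance:
    def __init__(self, threshold, on_contradiction):
        self.threshold = threshold
        self.status = None
        self.presence = UNKNOWN
        self.on_contradiction = on_contradiction   # placeholder for TR2

    def set_status(self, status):
        self.status = status
        required = cc13(status, self.threshold)
        if required is None:
            return
        if self.presence not in (UNKNOWN, required):
            # The new situation contradicts the existing one; the
            # handler decides what the presence becomes.
            self.presence = self.on_contradiction(self, self.presence, required)
        else:
            self.presence = required

log = []

def tr2_placeholder(state, old, new):
    """Hypothetical stand-in: record the contradiction, accept the new value."""
    log.append((old, new))
    return new
```

For example, a state with threshold 5 whose status is first set to
7 becomes CONFIRMED without any contradiction; setting the status
to 3 afterwards invokes the handler with (CONFIRMED, DENIED).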




After introducing the description structures of the
objects in CASNET (i.e. the structure of all the templates
in CASNET) we shall briefly discuss the forms and uses of
various types of templates, the way relation flags are used
in MDS to specify variations in the interpretation of
anchors, and the forms and interpretations of CC's and TR's
in CASNET. These will enable the reader to follow the
formal description in Appendix A more closely. We shall
follow this with detailed explanations of selected portions
of CASNET, with illustrations of their use.

It should be pointed out that the MDS system as
described here is not yet fully operational. Only the
DOMAIN DEFINITION part of MDS and a small part of the DOMAIN
EXECUTION system (the system that interprets templates,
instantiates them and evaluates the CC's and TR's) are now
operational. The development is not yet at a stage where it
can execute the CASNET description presented here. The
purpose of this report is to introduce the MDS facilities to
investigators in the area of medical modelling, in the
Rutgers Resource.

3.0: The DESCRIPTION STRUCTURE of CASNET.

We have already discussed the structural description of
CAUSALMODEL shown in figure 1, below. In this section, we
shall present the structures of the components of
CAUSALMODEL, that are used in CASNET description. The
structures presented here are intended to convey a general
understanding of what a CASNET is. These structures have
been taken directly from the templates defined for CASNET.
The branches in the diagrams below are labelled with the
relation names used in the templates, and have indications
thereon for the CC's and TR's associated with the various
relations. The two letter labels within [..], appearing as
suffixes of template names, indicate the types of the
respective templates. These are explained in greater detail
in section 4. For the moment the reader should keep in mind
the following:

MN: Every instance is itself a NODE template.

RN: NODE template. Every instance should have a name.

$N: NODE template. Names optional for instances.

$L: LIST template. Names optional for instances.

TI: Every instance is an INTEGER.

T#: Every instance is a NUMBER.

TA: Every instance is a LISP ATOM.

TS: Every instance is a STRING.

The structural view presented here is precisely the view
available to the system from the templates.

In the templates shown in Appendix A, the relations at
the various anchors have at times certain flags associated
with them. These flags are not shown in the structures
shown in this section. In discussing informally the
interpretations of the structures, we shall sometimes refer
to the flags associated with the relation in Appendix A. A
detailed discussion of the use of relation flags appears in
section 4.2.

CAUSALMODEL[RN]
 |
 |--causalnetdefn---->CAUSALNETDEFN[$N]
 |
 |--testdesns-------->TESTDESNS[$L]
 |
 |--classifications-->CLASSDEFNS[$L]
 |
 |--diseasedesns----->DISEASEDESNS[$L].

FIGURE 1: The structure of CAUSALMODEL.

The structure of CAUSALNETDEFN is shown in Figure 2.
Instances of CAUSALNETDEFN need not have names associated
with them (it is a $N template). It has seven components.
Notice that one of its components is the CAUSALMODEL. Thus,
for each CAUSALMODEL there is exactly one CAUSALNETDEFN and
for each CAUSALNETDEFN there is exactly one CAUSALMODEL.

The common threshold defines the threshold that is
common to all instances of STATEDESN. This common threshold
will be assumed to be the threshold of a <STATE> (an
instance of STATEDESN), unless a separate threshold had been
declared for the state. This condition is specified by
CC67, which is the constraint associated with the anchor,
(STATEDESN threshold), appearing in figure 2.




CAUSALNETDEFN[$N]
 |
 |--causalnetdefnof------->CAUSALMODEL[RN]
 |--commonthreshold------->THRESHOLD[TI]
 |--statedesns------------>STATEDESNS[$L]
 |--causedesns------------>CAUSEDESNS[$L]--elements
 |                                            |
 |                                       CAUSEDESN[$N]
 |                                            |
 |          STATEDESN[MN]<--state-------------|
 |          PROB[T#]<--CC19,transitionprob----|
 |
 |--startingstates,CC1---->STATEDESNS[$L]
 |--terminalstates,CC2---->STATEDESNS[$L]
 |--interiorstates,CC3---->STATEDESNS[$L]
 |--designatedstates,CC4-->STATEDESNS[$L]
                               |
                            elements
                               |
                          STATEDESN[MN]
                               |
      |--startingweight,CC7--->PROB[T#]
      |--descendents,CC8------>STATEDESNS[$L]
      |--causes,CC10---------->CAUSEDESNS[$L]
      |--threshold,CC67------->THRESHOLD[TI]
      |--status--------------->STATUS[TI]
      |--conflict------------->CONFLICT[TI]
      |--presence,CC13,TR2---->PRESENCE[TA]
      |--likelihood----------->LIKELIHOOD[$N]
                                   |
          |--totalinverseweight,CC57,TR6-->PROBS[$L]--elements-->PROB[T#]
          |--probability,CC53------------->PROB[T#]
          |--forwardweight,CC54,TR3------->PROB[T#]
          |--totalweight,CC55,TR4--------->PROB[T#]
          |--inverseweight,CC31,TR5------->CONDPROB[$N]
                                               |
               <STATE><--CC93,causestate-------|
               <STATE><--CC90,effectstate------|
               PROB[T#]<--CC56,probability-----|
               CONDPROB[$N]<--CC^2,nextprob----|

FIGURE 2: Structure of CAUSALNETDEFN. <STATE> is an
instance of STATEDESN.


Principally, each CAUSALNETDEFN contains a collection of
STATEDESNs, and CAUSEDESNs. Each CAUSEDESN has a state, and a
transition probability. Each CAUSEDESN, x, is in fact, the
description of a link of the form shown below:

--(x transitionprob)--> (x state).

When an instance of CAUSALNETDEFN is created, the system
will prompt the user for supplying its statedesns. The need
for this prompting is indicated by associating the prompt
flag, !, with the statedesns relation (we shall discuss the
various relation flags and their uses in section 4.2). The
startingstates, terminalstates, etc., of a CAUSALNETDEFN are
defined by CC1, CC2, etc. Once the statedesns are defined,
then the system will automatically find and assign the
starting, terminal and interior states of the CAUSALNETDEFN
using the constraint definitions CC1, CC2 and CC3
respectively, if and when they are needed. Similarly, it will
use CC8, anchored at (STATEDESN causedesns), to find the cause
descriptions from the definitions of the state descriptions.
The constraint CC4, for designatedstates, is such that the
system can use it only to check whether a given <STATE> is an
appropriate candidate for being a designatedstate; if no
candidate state is given then CC4 cannot, by itself, find the
appropriate candidates (these features are discussed in
greater detail in section 4.3). To find the designatedstates
of a CAUSALNETDEFN, the system will therefore prompt the model
builder.
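
How the starting, terminal and interior states might be found
from the state descriptions alone can be sketched in Python as
follows. The actual definitions of CC1, CC2 and CC3 are given in
Appendix A and are not reproduced in this section, so the rules
used here are assumptions: a starting state is taken to be one
that no link leads to, a terminal state one with no outgoing
links, and an interior state any other. The function name
classify_states and the dictionary layout are likewise
hypothetical. Each link follows the form
--(x transitionprob)--> (x state).

```python
def classify_states(links):
    """links maps each state name to its outgoing links, each given as a
    (transition_prob, target_state) pair."""
    targets = {target for out in links.values() for _prob, target in out}
    starting = {s for s in links if s not in targets}   # assumed reading of CC1
    terminal = {s for s in links if not links[s]}       # assumed reading of CC2
    interior = set(links) - starting - terminal         # assumed reading of CC3
    return starting, terminal, interior

# A three state chain: s1 --0.8--> s2 --0.5--> s3
links = {
    "s1": [(0.8, "s2")],
    "s2": [(0.5, "s3")],
    "s3": [],
}
starting, terminal, interior = classify_states(links)
```

Under these assumed rules the chain yields s1 as the only
starting state, s3 as the only terminal state and s2 as the only
interior state. No comparable computation is offered for the
designatedstates, since CC4 can only check a given candidate.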


Each STATEDESN has eight components. For each instance
of STATEDESN, called say <STATE>, each of these eight
components will be instantiated. The relations
startingweight, causes, and threshold have prompt flags
associated with them. Therefore, for an instance of
STATEDESN, MDS will prompt the user for the values of these
relations, which should be respectively, instances of NUMBER,
CAUSEDESNS, and INTEGER. The descendants of a STATEDESN is an
instance of STATEDESNS. This will be computed by CC8 from the
definitions of the "causes" of the STATEDESN.


The relations, status, presence, conflict and likelihood
of a STATEDESN, have the flag "C" (for CONSTANT) associated
with them. Because of this flag, the called templates of the
anchors containing these relations are treated as constants.
Thus, in an instance, <STATE>, of STATEDESN, (<STATE> status)
will also be the STATUS template, the same template called by
(STATEDESN status). Similarly, we will have (<STATE> presence
PRESENCE), (<STATE> conflict CONFLICT) and (<STATE> likelihood
LIKELIHOOD).


<STATE> itself will be a template, since STATEDESN is a
MN template. In a further instantiation of <STATE>, the
anchors of <STATE> that can be again instantiated will be
those whose called objects are templates. In the case of
<STATE> the relations, "startingweight", "causes" and
"threshold", will call respectively, a number (instance of
PROB), an instance of CAUSEDESNS (which will be a collection)
and an integer (instance of THRESHOLD). These called objects
are not templates. Therefore, in a further instance of
<STATE>, say x, these relations cannot be again instantiated.
In situations like this, for an instance of STATEDESN, like
<STATE>, the values of the relations (<STATE> startingweight),
(<STATE> causes), and (<STATE> threshold) will have the
following significance:


For all instances, x, of <STATE>, (x startingweight), (x
causes) and (x threshold) will be the same as (<STATE>
startingweight), (<STATE> causes) and (<STATE>
threshold), respectively.


Thus, if <STATE1> and <STATE2> are two instances of
STATEDESN, these two may have, for example, different
startingweights. However, all instances of <STATE1> will have
the same startingweight as <STATE1> itself, and similarly with
<STATE2>. Now, for both <STATE1> and <STATE2>, it will be
true that (<STATE1> status is STATUS) and (<STATE2> status is
STATUS). "STATUS" here is a template. Therefore, instances
of <STATE1> will have as their status, instances of STATUS.
Thus, different instances of <STATE1> may have different
statuses. In fact, for instances of a <STATE> the only
instantiated relations will be: status, presence, conflict and
likelihood. Different instances of a <STATE> may vary in the
values they have for these relations.
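
The two levels of instantiation described above can be pictured with a
small Python sketch. It is only an illustration and not MDS itself: the
class names StateTemplate and StateInstance, the field types, and the
sample values are assumptions made for the example. The three relations
whose called objects are not templates are read through to the <STATE>,
while status, presence, conflict and likelihood belong to each instance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateTemplate:
    """A <STATE>: an instance of STATEDESN that is itself a template."""
    name: str
    startingweight: float   # fixed for every instance of this <STATE>
    causes: tuple           # fixed for every instance of this <STATE>
    threshold: int          # fixed for every instance of this <STATE>

    def instantiate(self, status, presence, conflict, likelihood):
        return StateInstance(self, status, presence, conflict, likelihood)


@dataclass
class StateInstance:
    """An instance x of a <STATE>; only four relations are its own."""
    template: StateTemplate
    status: int
    presence: str
    conflict: str
    likelihood: float

    def __getattr__(self, relation):
        # Reached only when normal lookup fails: read through to the <STATE>.
        if relation in ("startingweight", "causes", "threshold"):
            return getattr(self.template, relation)
        raise AttributeError(relation)


# Illustrative values only.
state1 = StateTemplate("STATE1", 0.3, ("cause1",), 2)
x1 = state1.instantiate(status=1, presence="YES", conflict="NO", likelihood=0.7)
x2 = state1.instantiate(status=2, presence="NO", conflict="NO", likelihood=0.1)
```

Here x1 and x2 differ in status but necessarily share the startingweight
of STATE1.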


The remaining structures, the structures of LIKELIHOOD
and CONDPROB, shown in Figure 2, are self explanatory. We
shall have more to say about the interpretation of these
structures later in this report. The structures for
TESTDESNS, CLASSDEFNS and DISEASEDESN are shown in figures 3, 4
and 5 respectively. The reader is invited to scan through
these before proceeding with the rest of the paper.


For an instance of TESTDESN, say <TEST>, the anchors for
the relation names summaryquestion, repeatability, cost,
confidence, negativedeterminacy, counter, effects, nexttest,
firsttest, negativenexttest, and testtype would all be
instantiated to constants (items that are not templates, or
items which are templates but which are to be treated as


TESTDESNS[$L]
   |
   |-strategies-->STRATEGIES[$L]--elements-->STRATEGY[RN]
   |                                            |
   |                 INFLUENCE<--influences-----|
   elements
   |
TESTDESN[RN]
   |-components,CC21----------->TESTDESNS[$L]
   |-summaryquestion,CC23------>QUESTION[TS]
   |-repeatability,CC24-------->YESNO[TA]
   |-cost,CC25----------------->COST
   |-confidence,CC26----------->CONFIDENCE
   |-negativedeterminacy,CC27-->YESNO[TA]
   |-counter,CC28-------------->COUNTER[TI]
   |-effects,CC29-------------->EFFECTS[$L]--elements-->EFFECT[$N]
   |-firsttest,CC30------------>TESTDESN[RN]
   |-nexttest,CC31------------->TESTDESN[RN]
   |-negativenexttest,CC32----->TESTDESN[RN]
   |-testtype,CC20------------->TESTTYPE[TA]
   |-costratio,CC33------------>COSTRATIO
   |-testresult,CC34,TR1------->YESNO[TA]
   |-application--------------->APPLICATION

EFFECT[$N]
   |-how------------->HOW[TA]
   |-affectedstate--->STATEDESN

APPLICATION
   |-candidatestates,CC50--><STATE>s
   |-candidatetests,CC51---><TEST>s
   |-nextchoice,CC52-------><TEST>
   |-applicationof,CC65----><TEST>




FIGURE 3: The structure of TESTDESNS. <TEST> is an
instance of TESTDESN.


constants). Thus, when a <TEST> is instantiated, as part of
the process of building a DISEASEDESN, only the anchors
(<TEST> costratio), (<TEST> application) and (<TEST>
testresult) will be instantiated. In the process of
instantiating the testresult the appropriate question for the
test will be asked by a function called ASKQUESTION. The
result of the test will then interact with the affected
states, and cause a whole chain of changes to take place in
the current model of the disease, consistent with the
semantics of the pertinent CC's. These changes will be
completely guided by the CC's involved, with the TR's
providing some of the crucial control structures. We shall
examine this process more carefully in section 5.


CLASSDEFNS[$L]--elements-->CLASSDEFN[$N]
                              |
                              |-classtype,CC15-->CLASSNAME[TA]
                              |-firstentry------>ENTRYDEFN[$N]
                                                     |
        STATEDESN<------CC70,entrystate--------------|
        STATEDESNS[$L]<-CC17,descendants-------------|
        COMMENTS[$L]<---comments---------------------|
        ENTRYDEFN[$N]<--nextentry--------------------|
        ENTRYDEFN<------CC18,lowerentries------------|

COMMENTS[$L]--elements-->COMMENT[$N]
                            |
                            |-diagnosis-->STATEMENT[TS]
                            |-therapy---->STATEMENT[TS]


FIGURE  4:  The  structure  of  CLASSDEFNS 




Each CLASSDEFN has a classtype, which is a CLASSNAME, and
an ENTRYDEFN, which is its firstentry. The system will prompt
for both of these when a CLASSDEFN is instantiated. Each
ENTRYDEFN has an entrystate and a collection of descendants
(which would be the entrystates of the lowerentries of the
ENTRYDEFN). The comments associated with an ENTRYDEFN would
specify the diagnosis and therapy for the entry. The
nextentry of an ENTRYDEFN is again an ENTRYDEFN. The
lowerentries are the collection of ENTRYDEFNs under the
closure of the nextentry relationship (lowerentries is a
transitive relationship). It is also reflexive. Hence an
ENTRYDEFN is, by definition, its own lowerentry.
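
As a rough illustration of the lowerentries relation just described, the
following Python sketch follows the nextentry links from an entry and
includes the entry itself. The dictionary representation of nextentry
and the entry names E1, E2, E3 are assumptions made for the example;
None stands in for NIL.

```python
def lowerentries(entry, nextentry):
    """Reflexive, transitive closure of nextentry, starting at `entry`."""
    result = []
    seen = set()
    current = entry
    while current is not None and current not in seen:
        seen.add(current)          # guard against an accidental cycle
        result.append(current)     # first pass adds `entry` itself (reflexive)
        current = nextentry.get(current)
    return result


# Hypothetical chain of entries: E1 -> E2 -> E3 -> NIL
nextentry = {"E1": "E2", "E2": "E3", "E3": None}
print(lowerentries("E1", nextentry))  # ['E1', 'E2', 'E3']
```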


DISEASEDESN[$N]<--elements--DISEASEDESNS[$L]
   |
   |-causalmodel------------->CAUSALMODEL[RN]
   |-date-------------------->(DATE)
   |-diseaseof--------------->PERSON[RN]
   |                             |-sex---->SEX[TA]
   |                             |-years-->YEARS[TI]
   |-diagnosis/therapy,CC37-->COMMENTS[$L]
   |-topleveltest------------>TOPLEVELTEST[$N]
   |-causalnet--------------->CAUSALNET

TOPLEVELTEST[$N]
   |-currenttests,CC39------->TESTDESNS[$L]
   |-selectedtests,CC103-----><TEST>s
   |-descendants,CC101------->TOPLEVELTEST
   |-ancestors,CC102--------->TOPLEVELTEST
   |-nexttest---------------->TOPLEVELTEST

CAUSALNET
   |-causalnetof------------->DISEASEDESN[$N]
   |-states,CC40-------------><STATE>s
   |-causes,CC41-------------><CAUSE>s
   |-tests,CC100-------------><TEST>s
   |-startingstates,CC42-----><STATE>s
   |-terminalstates,CC43-----><STATE>s
   |-interiorstates,CC44-----><STATE>s
   |-mlstartingstates,CC45---><STATE>s
   |-pathways---------------->PATHWAY[$N]

PATHWAY[$N]
   |-startingstate----------><STATE>
   |-components-------------><STATE>s
   |-nextpathway------------><PATHWAY>




FIGURE 5: The structure of DISEASEDESN.

Each DISEASEDESN has a CAUSALMODEL. It is made on the
indicated date, supplied by the (DATE) function. The
description is the diseaseof a PERSON, and has a causalnet
associated with it. The construction of the DISEASEDESN will
begin with the TOPLEVELTEST, which would from then on continue
the testing process via its own nexttest, which is again an
instance of TOPLEVELTEST. Each time the currenttests of a
TOPLEVELTEST is instantiated, a collection of possible
currenttests will get selected. From among these the
"selectedtest" of the TOPLEVELTEST will be chosen. If there
is only one currenttest, then obviously this will get chosen
as the selectedtest. If more than one currenttest exists then
the system will prompt the user for advice about which, among
the equally feasible collection of currenttests, should be
applied in the existing context. The selected test (or tests)
will get instantiated, and as a result will get applied. The
application of these <TEST>s will cause the appropriate
changes in the CAUSALNET of the DISEASEDESN. This process
will continue till the nexttest is NIL. At this point the
diagnosis/therapy will be instantiated. The associated
COMMENTS may be printed out.
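
The testing cycle described above can be sketched in Python as a simple
loop. This is an illustration under stated assumptions, not the MDS
control structure, which is driven by CC's and TR's: the function name
run_testing, the dictionary encoding of currenttests and nexttest, the
callbacks ask_user and apply_test, and the names T1, T2, gt1, gt2, gt3
are all hypothetical. It also assumes every TOPLEVELTEST has at least
one currenttest and that a single test is selected each time.

```python
def run_testing(first, currenttests, nexttest, ask_user, apply_test):
    """Follow the nexttest chain, applying one selected test per step."""
    applied = []
    node = first
    while node is not None:                  # None plays the role of NIL
        candidates = currenttests[node]
        if len(candidates) == 1:
            selected = candidates[0]         # only one choice: take it
        else:
            selected = ask_user(candidates)  # otherwise ask for advice
        apply_test(selected)                 # would update the CAUSALNET
        applied.append(selected)
        node = nexttest[node]
    return applied


currenttests = {"T1": ["gt1"], "T2": ["gt2", "gt3"]}
nexttest = {"T1": "T2", "T2": None}
log = []
applied = run_testing("T1", currenttests, nexttest,
                      ask_user=lambda tests: tests[-1],
                      apply_test=log.append)
```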

An instance of CAUSALNET will have, to begin with, a
complete set of instances of the <STATE>s and <CAUSE>s of the
associated CAUSALNETDEFN. The startingstates, terminalstates,
etc. of this instance of CAUSALNET will be determined during
the course of test applications, by the CC's CC42, CC43 and
CC44. The mlstartingstates of a CAUSALNET is the collection
of starting states from which the largest number of CONFIRMED
states in the causalnet are reachable. The PATHWAYS of a
CAUSALNET will be the causal chains in the net which have all
their states CONFIRMED, and for which none of the ancestors
of their states are DENIED. These PATHWAYS are used to
produce the diagnosis/therapy.
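
The condition that defines a PATHWAY can be stated as a short Python
check. This is a sketch under assumptions: the status values are
represented as strings, causal links are given as a hypothetical
causedby dictionary mapping each state to its direct causes, and a
candidate chain is passed in as a list of state names. How MDS
enumerates the candidate chains is not shown.

```python
def ancestors(state, causedby):
    """All states from which `state` can be reached along causal links."""
    seen = set()
    stack = list(causedby.get(state, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(causedby.get(s, ()))
    return seen


def is_pathway(chain, status, causedby):
    """True if every state in the chain is CONFIRMED and no ancestor is DENIED."""
    if any(status[s] != "CONFIRMED" for s in chain):
        return False
    return all(status[a] != "DENIED"
               for s in chain
               for a in ancestors(s, causedby))


# Hypothetical net: A causes B, B causes C, X causes D.
causedby = {"B": ["A"], "C": ["B"], "D": ["X"]}
status = {"A": "CONFIRMED", "B": "CONFIRMED", "C": "CONFIRMED",
          "D": "CONFIRMED", "X": "DENIED"}
```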


This completes the structural description of the CASNET.
To understand the details of the structural description as
specified by the templates in Appendix A, it is necessary to
know the significance of the various template and relation
flags, and also the use of the so called function template.
To understand the way these templates are used and interpreted
it is necessary to know the meaning and interpretations of the
CC's and TR's associated with the various anchors. These are
all described in the next section.

4.0:  Templates  and  their  instantiations. 

4.1: Types of templates and examples of their use in CASNET.

By instantiating a TEMPLATE in MDS, one gets an
instance of the object specified by the template. A
complete instantiation of a template would call for the
instantiation of all its anchors. There are various types
of templates in MDS. We have already seen the distinction
between the NODE template and the LIST template.
CAUSALMODEL is a NODE template, whereas TESTDESNS is a LIST
template. For convenience we shall require that every
instance of CAUSALMODEL should have a NAME. This is
indicated by the flag "RN" (Regular Node) associated with
the CAUSALMODEL template. If the naming of instances of a
template is to be left optional, then the flag "$N" would be
associated with a NODE template. Thus, CAUSALNETDEFN is a
Dummy Node template. Not every instance of CAUSALNETDEFN
need have a name. An unnamed instance of CAUSALNETDEFN may
be accessed only via the CAUSALMODEL to which it is related.
For example, the CAUSALNETDEFN for GLAUCOMA can be accessed
only by the phrase (GLAUCOMA causalnetdefn), if the instance
of the CAUSALNETDEFN itself was unnamed. Similarly, we may
have also REGULAR LIST (RL) and DUMMY LIST ($L) templates.
We shall indicate the type of a template by the flag
associated with it.


We saw several examples of TERMINAL templates in
section 3. Templates like THRESHOLD (Terminal Integer),
PROB (Terminal Number), STATUS (Terminal Integer), PRESENCE
(Terminal Atom), STATEMENT (Terminal String), etc., are all
TERMINAL templates. Instances of TERMINAL templates in MDS
would be the primitive data types of the system. In MDS,
TEMPLATE itself is one of its primitive data types. It is
the only instantiable data type. Thus, we can have TERMINAL
TEMPLATE templates, whose instances are themselves required
to be templates. One may think of a TERMINAL TEMPLATE as
specifying the schema for defining a template.




We have encountered several TERMINAL TEMPLATES; the
templates STATEDESN, CAUSEDESN, TESTDESN, etc., are special
cases of TERMINAL TEMPLATES. They are all TERMINAL NODE
templates. Instances of these are required to be themselves
NODE TEMPLATES. They could be either REGULAR NODE or DUMMY
NODE templates. Thus, in GLAUCOMA, the instance of
TESTDESNS, which is a collection of instances of TESTDESN,

GTESTS = (gt1, gt2, ..., gtN),

each gt would itself be a NODE template. It would define a
particular test applicable to GLAUCOMA. Instances of gt
might be used for the diagnosis of particular occurrences of
the disease. In the model establishment process, the
various terminal templates (like TESTDESN, CAUSEDESN,
STATEDESN, etc.) are used to create descriptions of <TEST>s,
<CAUSAL-LINK>s, <STATE>s, etc., that pertain to a given
disease domain. Since these descriptions are themselves
templates, they may further be instantiated to generate
DISEASEDESNs for particular occurrences of the disease.

The few other templates of interest to us in this paper
are the UNIVERSAL TEMPLATE, **, and FUNCTION TEMPLATES.
Anything can be an instance of a UNIVERSAL template. Thus, a
CAUSALMODEL, CAUSALNETDEFN, STATEDESN, etc., may all be
viewed as being instances of **. We shall see uses of ** in
the CASNET description. A function template is used to define
functions about which MDS would know some properties. The
principal properties of a function defined in a function
template are: what the arguments of a function can be, what
its result is, and how the arguments and the result relate.
The process of instantiation of a function template is the
same as the evaluation of the function for given arguments.
The result of the instantiation would be the result returned
by the function. As an example of a function template
consider,

[TDN: ADD (FNDEF NUMBER NUMBER NUMBER)],

which is the definition of the function template called
ADD. It has two arguments, both of which are NUMBERs, and
has a result which is also a NUMBER. The last item in the
FNDEF of a function template is always the definition of the
result of the function. In the case of ADD no constraints
have been defined relating the result of the function to its
arguments. This relationship will be established by the
procedure for ADD associated with the ADD function. In this
case MDS itself would not know anything about this
procedure. We have another example of a function template
in SUM defined below:

[TDN: SUM (FNDEF (LISTDT CC99) (** CC60))].

In this case the argument of SUM is an instance of LISTDT (a
LISP data type, the LISP LIST, to be contrasted with a LIST
template) that satisfies the constraint CC99. This
constraint specifies that the elements of the LISTDT should
all be instances of either INTEGER, or NUMBER, or a template
whose template flag is TI or T#. The result can be anything
(instance of **) that satisfies the constraint CC60. CC60
specifies that the result is a NUMBER, INTEGER, or an
instance of a TI or T# template depending on the elements of
the LISTDT argument. If there exists a TI or T# template X,
such that all the elements of LISTDT are instances of X,
then the result is also an instance of X. If there exists
an instance of NUMBER in the LISTDT then the result is a
NUMBER. Otherwise, the result is an INTEGER. MDS has more
information about SUM than it has about ADD. One may use
these function templates in the definitions of CC's and
TR's, and may also use them as the called templates in
anchors. The function IT (Instantiate Template) is often
used as the called template in the CASNET description. Thus, in
the STATEDESN template (see Appendix A), the called template
for (STATEDESN likelihood) is (IT LIKELIHOOD). This
indicates that for an instance of STATEDESN called, say,
STATE1, (STATE1 likelihood) should be a new instance of the
LIKELIHOOD template created by (IT LIKELIHOOD). The
function template IT itself has the definition:

[TDN: IT (FNDEF (TEMPLATE TEMPLATES) (NAME NAMES)
(MDNH MDNHS))].

The first argument (arg1) of IT can be either a TEMPLATE or
TEMPLATES (collection of TEMPLATEs). The second one can
be, similarly, a NAME or a collection of NAMEs. The result
is a data type called MDNH (Model Definition Header, a
pointer), or a collection of MDNH's. Every instance of a
template in MDS will have a model definition header
associated with it. The instantiated model will have the
name given by the NAME argument. If no NAME is given, or
the name NIL is given, then the instantiated model will have
no name.
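
Returning to SUM, the rule that CC60 places on the type of its result
can be sketched in Python. This is an illustration only: each element
of the LISTDT is modelled as a (value, type name) pair, the
template_flags table is a hypothetical stand-in for the template
definitions, and the three cases are applied literally in the order
given in the text.

```python
def sum_result_type(elements, template_flags):
    """Type of the result of SUM for a list of (value, type_name) pairs."""
    types = {type_name for _, type_name in elements}

    # CC99: every element is an INTEGER, a NUMBER, or an instance of a
    # template whose flag is TI or T#.
    for t in types:
        if t not in ("INTEGER", "NUMBER") and \
                template_flags.get(t) not in ("TI", "T#"):
            raise ValueError(f"element type {t} violates CC99")

    # CC60, case 1: all elements are instances of one TI or T# template X.
    if len(types) == 1:
        (only,) = types
        if template_flags.get(only) in ("TI", "T#"):
            return only
    # CC60, case 2: some element is a NUMBER.
    if "NUMBER" in types:
        return "NUMBER"
    # CC60, case 3: otherwise the result is an INTEGER.
    return "INTEGER"


flags = {"THRESHOLD": "TI", "PROB": "T#"}   # flags as given in the text
```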

When using a function template as the called template
one can use a CC to bind the argument for the particular
function call of the function. Thus, the called template
for the anchor (APPLICATION nextchoice) in Appendix A has
been defined to be (IT (? CC64)). The argument (? CC64)
is to be read as "anything that satisfies CC64", where the
constraint CC64 actually specifies the algorithm for the
selection of the test for the nextchoice in a diagnosis
process.

The various types of templates introduced so far are
summarized in Figure 6 below (not all the types of templates
available in MDS are shown here).




                              TEMPLATE
                                 |
     ------------------------- types -------------------------
     |               |               |                        |
    NODE            LIST          FUNCTION          PRIMITIVE-DATATYPE
     |               |               |              [INTEGER, NUMBER,
     |               |               |               STRING, TEMPLATE,
     |               |               |               NODE, LIST, etc.]
     ----------------------- variations ----------------------
     |       |       |       |       |       |                |
  REGULAR  DUMMY  REGULAR  DUMMY  REGULAR  DUMMY          TERMINAL
    RN      $N      RL      $L      RF      $F      TI, T#, TS, T?, etc.


FIGURE 6: The types of templates.


LIST templates have a special interpretation in MDS.
For the CAUSALMODEL, GLAUCOMA, as we have seen before we
have (GLAUCOMA testdesns) = (gt1 gt2 ... gtN). We shall
say that (GLAUCOMA testdesns (gt1 gt2 ... gtN)) is true in
the data base of MDS. The relation (GLAUCOMA testdesns (gt1
gt2 ... gtN)) is interpreted in MDS as signifying that all
the relations (GLAUCOMA testdesn gt1), (GLAUCOMA testdesn
gt2), ... and (GLAUCOMA testdesn gtN) are true. [We shall
interchangeably use the singular and plural forms of a
relation, as in "testdesns" and "testdesn", above.] Thus,
by convention, relations normally distribute over
collections.
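
The distribution of a relation over a collection can be pictured as
follows. In this Python sketch the data base is modelled, purely for
illustration, as a set of (subject, relation, value) triples; the helper
name assert_relation and the rule of dropping a trailing "s" to obtain
the singular form are assumptions, not part of MDS.

```python
def assert_relation(db, subject, relation, value):
    """Store (subject relation value), distributing over collections."""
    singular = relation[:-1] if relation.endswith("s") else relation
    items = value if isinstance(value, (list, tuple)) else [value]
    for item in items:
        db.add((subject, singular, item))


db = set()
# (GLAUCOMA testdesns (gt1 gt2 gt3)) becomes three separate relations.
assert_relation(db, "GLAUCOMA", "testdesns", ["gt1", "gt2", "gt3"])
```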


4.2: The relation flags and their uses.

In each anchor of the form (X r), where X is a template
and r is a relation name, one may associate flags with r
to indicate a variety of variations on the interpretation of
the relation, r, in the anchor. Only some of these are
discussed below. For a more complete description of the
relation flags and their uses the reader is referred to
[Srinivasan 1975b].

[Flag !]: The PROMPTING flag (*).

A flag often used in the CASNET description is the
PROMPTING flag, !. For example, we have in the CAUSALMODEL
template, the relation definition,

((testdesns !) (TESTDESNS $L) testdesnsof)

[see Appendix A]. The exclamation mark is here associated
with the relation "testdesns". This has the following
significance:

When an instance of CAUSALMODEL is created the system
would attempt to instantiate every anchor in the
CAUSALMODEL template that has the prompting flag
associated with it. If a consistency condition (CC),
or a transformation rule (TR), is associated with the
anchor, and if it were possible to find the called
object of the anchor by evaluating the CC and/or TR,
then the system would find (or create) the appropriate
called object and instantiate the anchor. If no CC or
TR is available, or if an appropriate called object for
the anchor cannot be found by evaluating the CC or TR,
then the system would prompt the user for supplying the
appropriate called object for instantiating the anchor.


Thus, in the case of (CAUSALMODEL testdesns), since
there is no CC or TR associated with it, the system would
prompt the user to supply the appropriate instance of
TESTDESNS to be incorporated in the instance of the anchor,
(CAUSALMODEL testdesns). As the reader may notice, the same
prompting convention holds also for the relations
"causalnetdefn" and "classifications" in the CAUSALMODEL
template. The use of this prompting flag enables one to
create a complete CAUSALMODEL for a disease domain by just
issuing the command:


[IT CAUSALMODEL DISEASE-NAME].


The presence of the prompting flag at the various anchors in
the templates describing CASNET would then initiate a whole
series of enquiries to the model builder to complete the
instantiation of the CAUSALMODEL, and the instances of the
templates called by CAUSALMODEL. Thus, the prompting for
the anchor (CAUSALMODEL causalnetdefn) would necessitate the
instantiation of CAUSALNETDEFN, which would then initiate
further promptings to get its own instantiation completed.
These promptings will propagate in "depth first" manner
through the network of templates beginning at the
CAUSALMODEL. An example of the model building process using
MDS is discussed in section 5.1.


* The use of the PROMPTING convention was suggested by
N. S. Sridharan.
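
The depth first propagation of prompts can be sketched in Python as a
recursive walk over a table of templates. Everything in the table below
is an assumption made for the illustration: the called template of
"classifications", the presence of a prompting flag on (CAUSALNETDEFN
statedesns), and the idea that a template absent from the table is
simply supplied by the user. Evaluation of CC's and TR's before
prompting is left out.

```python
# Hypothetical template table:
# name -> [(relation, called template, has "!" flag)]
TEMPLATES = {
    "CAUSALMODEL": [
        ("testdesns", "TESTDESNS", True),
        ("causalnetdefn", "CAUSALNETDEFN", True),
        ("classifications", "CLASSDEFNS", True),
    ],
    "CAUSALNETDEFN": [
        ("statedesns", "STATEDESNS", True),
    ],
}


def instantiate(template, ask, path=()):
    """Instantiate `template`, descending depth first through "!" anchors."""
    if template not in TEMPLATES:
        # No further structure known here: prompt the model builder.
        return ask(path, template)
    instance = {}
    for relation, called, prompt in TEMPLATES[template]:
        if prompt:
            instance[relation] = instantiate(called, ask, path + (relation,))
    return instance


asked = []  # records the order in which prompts are issued


def ask(path, template):
    asked.append(path)
    return f"<{template} supplied by user>"


model = instantiate("CAUSALMODEL", ask)
```

The list asked shows the prompt for statedesns being issued while the
causalnetdefn anchor is still being completed, before the system returns
to classifications.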


[Flag D]: The DEPTH Flag.

Normally, when an anchor is instantiated, like for
example the anchor (LIKELIHOOD inverseweight), for an
instance x of LIKELIHOOD, it would call for an
appropriate instance of the called template of the
anchor to be supplied; in our case here, an instance
of CONDPROB. Let y be the instance of CONDPROB
assigned to (x inverseweight). It is, of course,
possible that the instance y of CONDPROB might itself
be not completely instantiated, i.e. some of the
relations in CONDPROB might not have been instantiated
for y. If the anchor (LIKELIHOOD inverseweight) had,
however, the D flag (indeed, it does), then the system
would automatically attempt to complete the instance y
for all the relations defined in CONDPROB. This
process of completing the instances would proceed to
arbitrary depth, until terminal objects (primitive data
types) are reached.

In the case of CONDPROB (see figure 2 in section 3) the
anchor (CONDPROB nextprob) calls again CONDPROB. Thus, in
this case the D flag would cause a whole series of instances
of CONDPROB to be created, until the process is terminated
by the CC, CC92, associated with (CONDPROB nextprob).
Unless the D flag is carefully used one can get into loops.
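
A minimal Python sketch of this behaviour follows. It is not the MDS
mechanism: the dictionary record, the should_stop callback (standing in
for whatever CC92 tests) and the limit parameter are assumptions made
for the example. It shows why a terminating condition is needed when an
anchor with the D flag calls its own template.

```python
def complete_condprob(depth, should_stop, limit=100):
    """Build a chain of CONDPROB-like records linked through nextprob."""
    if depth >= limit:
        # Without a terminating condition the D flag would expand forever.
        raise RecursionError("D flag expansion did not terminate")
    node = {"depth": depth, "nextprob": None}
    if not should_stop(node):
        node["nextprob"] = complete_condprob(depth + 1, should_stop, limit)
    return node


def chain_length(node):
    count = 0
    while node is not None:
        count += 1
        node = node["nextprob"]
    return count


# Hypothetical stopping rule: stop once the third record is reached.
chain = complete_condprob(0, lambda node: node["depth"] >= 2)
```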


[Flag  V]:  The  VARIABLE  relation  flag 




An example of this flag occurs in the anchor
(TESTDESN confidence). The V flag indicates that the
relation "confidence" is a variable relation in the
TESTDESN template: not every instance of TESTDESN
would have the "confidence" relation defined for it.
The CC at the anchor (TESTDESN confidence), CC26, would
specify the conditions under which an instance of
TESTDESN would have the "confidence" relation defined
for it.


[Flag $]: The DUMMY Flag.

An example of this occurs at the anchor (TESTDESN
costratio). The $ flag associated with this anchor
indicates that the instances of this anchor are not
stored in the data base. Every time the costratio of
an instance of TESTDESN is required it would be
computed by the system using the CC, CC33, associated
with the anchor (TESTDESN costratio).
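
In Python terms a $ relation behaves like a computed property rather
than a stored field. The sketch below illustrates only that idea; the
class name TestDesn and, in particular, the formula cost divided by
confidence are hypothetical stand-ins, since the content of CC33 is not
given here.

```python
class TestDesn:
    """Illustrative stand-in for an instance of TESTDESN."""

    def __init__(self, cost, confidence):
        self.cost = cost
        self.confidence = confidence

    @property
    def costratio(self):
        # Hypothetical stand-in for CC33: recomputed on every request,
        # never stored with the instance.
        return self.cost / self.confidence


t = TestDesn(cost=10, confidence=0.5)
first = t.costratio
t.cost = 5                # a later change is reflected automatically
second = t.costratio
```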


[Flag C]: The CONSTANT Flag (*).

Normally, if the called template of an anchor (X
r) is Y, then in an instance, x, of X, (x r) will have
as value an instance, y, of Y. However, if the anchor
(X r) has the constant flag, C, associated with it,
then in the instance x, the value of (x r) would be
simply Y. An example of this occurs at the anchor
(STATEDESN status). The called template for this is
STATUS. For an instance, <STATE>, of STATEDESN,
(<STATE> status) will also be STATUS. Because of the C
flag one level of instantiation is skipped. Also, the
CC and TR associated with (STATEDESN status) will be
moved to the level of <STATE>. Thus, the definition of
(<STATE> status) would appear as:

(status (STATUS TI) statusof CC11),

where CC11 is the CC associated with (STATEDESN
status).


[Flag >]: The INDIRECT Flag.


The indirect flag is used only in TERMINAL
TEMPLATES. An example of this flag occurs in the
STATEDESN template in Appendix A, at the anchor
(STATEDESN likelihood). The definition of this
relation occurs in the STATEDESN template as follows:


((likelihood C>!) (IT LIKELIHOOD NIL) likelihoodof)


For an instance of STATEDESN, say STATE1, (STATE1
likelihood) will be also (IT LIKELIHOOD) (IT is the
Instantiate Template command). This is because of the
C flag (see [Flag C] convention above). The ">" flag
indicates that the flag immediately following it should
be attached to the instantiated anchor. Thus, in our
case (STATE1 likelihood) will acquire the flag !, or in
other words the likelihood relation definition in the
STATE1 template will look like:


((likelihood !) (IT LIKELIHOOD NIL) likelihoodof).


This would imply that, when STATE1 itself is
instantiated, a prompting would occur for the anchor
(STATE1 likelihood).


* The use of the CONSTANT flag was suggested by Joel Irwin.


[Flag X]: The TRANSITIVITY flag.

This flag is used to indicate that the relation at
an anchor is transitive. An example of this occurs at
(STATEDESN descendants). The CC at this anchor, CC8,
specifies that the descendants of a STATEDESN, STATE1,
are precisely the states that are caused by STATE1.
However, since the descendants relation is transitive,
every time (STATE1 descendants) is asked the system will
return the transitive closure of what is stored in the
data base.
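
The difference between what is stored and what is returned for a
transitive relation can be sketched in Python. The dictionary causes,
holding only the directly caused states, and the state names S1 to S4
are assumptions made for the illustration.

```python
def descendants(state, causes):
    """Transitive closure of the stored 'directly caused' links."""
    seen = set()
    stack = list(causes.get(state, ()))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(causes.get(s, ()))
    return seen


# Stored in the data base: only the direct links.
causes = {"S1": ["S2"], "S2": ["S3", "S4"], "S3": [], "S4": []}
print(sorted(descendants("S1", causes)))  # ['S2', 'S3', 'S4']
```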


[Flag H]: The REFLEXIVE Flag.

This specifies that the relation at an anchor is
reflexive. An example of this occurs at the anchor
(ENTRYDEFN lowerentries). For every instance, E, of
ENTRYDEFN, (E lowerentries E) is true.

This completes the discussion of all the relation flags
used in the CASNET description.


4.3: The CONSISTENCY CONDITIONS.


4.3.1: The form of CC's




As we have seen before, every anchor may have a
CONSTRAINT (CC) and a TRANSFORMATION RULE (TR) associated
with it. Conversely, it is true that every CC has a unique
anchor associated with it. This is not, however, true for
TR's. There can be so called "floating TR's" which are not
attached to any anchor. Floating TR's have not been used in
the CASNET description. The CC's are stated as expressions
defining sets (collections of objects). Let (X r) be an
arbitrary anchor. Let Y be the called template of (X r).
Then the CC at (X r), CC[X r], will have the form:

[(Y y) | P(@ y)].

This is to be read as: "The collection of all instances, y,
of the template, Y, such that the PREDICATE, P(@ y), is
satisfied". Here, the distinguished symbol, @, refers to
the CURRENT INSTANCE of the template, X, the template of the
anchor of the CC. The CC is evaluated at @. The predicate
in a CC will always have two FREE VARIABLES. We shall refer
to @ as the ANCHOR VARIABLE (or simply the ANCHOR) of the
CC, and y as the SET VARIABLE of the CC. The predicate
itself is a first order expression using function symbols,
relative quantifiers, and logical connectives. We shall
refer to the predicate P(@ y) as the SET PREDICATE of the
CC. Let us consider a simple example.

Consider the anchor (CAUSALNETDEFN startingstates).
This anchor calls the template STATEDESNS. Thus, for
GLAUCOMA, the startingstates of its CAUSALNETDEFN will be
an instance of STATEDESNS, i.e. a collection of instances
of STATEDESN. The CONSTRAINT, CC1, is associated with this
anchor. CC1 states,


[(STATEDESN S) | (@ statedesn S) (S causedby NIL)].


CC1 specifies that

"The startingstates of a CAUSALNETDEFN is the
collection of all instances of STATEDESN, S, such that
S is a statedesn of the CAUSALNETDEFN (i.e. S is a
state in the CAUSALNETDEFN), and there are no other
states that cause S."
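
The set-defining reading of CC1 can be sketched in Python. This is only an illustration and not MDS itself: the dictionary layout, the function name starting_states, and the three sample states A, B and C are assumptions made for the example.

```python
# Hypothetical miniature data base: each state maps to the states that cause it.
caused_by = {
    "A": [],      # nothing causes A, so (A causedby NIL) holds
    "B": ["A"],
    "C": ["B"],
}

def starting_states(net_states, caused_by):
    """CC1 read as a set definition: the states of the net that no state causes."""
    return {s for s in net_states if not caused_by.get(s)}

print(starting_states(["A", "B", "C"], caused_by))
```

The comprehension plays the role of the SET PREDICATE: membership in net_states stands for (@ statedesn S), and the empty cause list stands for (S causedby NIL).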


As before, we shall use the name GCASNET for the
causalnetdefn of GLAUCOMA. Then, CC1 will be evaluated at
(GCASNET startingstates), and the ANCHOR VARIABLE, @, of CC1
will be bound to GCASNET. In CC1 the predicate P(@ S) is
"(@ statedesn S) (S causedby NIL)". There is an implicit
logical AND between the two relations in P(@ S). (*) In
this CC there are no quantifiers. We shall later see
examples of CC's that use quantifiers. In the context of
instantiation of an anchor, its associated CC has the
following interpretation:


THE IFF INTERPRETATION OF A CC:

Let (X r) be an anchor, and let Y be the called
template of (X r). Let the CC at (X r) be CC[X r].
Let P(@ y) be the SET PREDICATE of CC[X r]. Then, for
an instance, y of Y, it is true that,


(@ r y) <-> P(@ y).


* We shall generally omit the logical AND sign, &, between
relational predicates occurring in a CC. The logical symbols
one can use in a CC are: V (OR), & (AND), ¬ (NOT), ->
(IMPLIES), <-> (IFF).



Thus, for (GLAUCOMA startingstates), a candidate state, say
s, can be the starting state of GCASNET, if and only if, it
is a state of GCASNET, i.e. (GCASNET statedesn s) is true,
and no other state causes s, i.e. (s causedby NIL) is also
true:


(GCASNET startingstate s) <-> (GCASNET statedesn s)
                              (s causedby NIL).

In general, the relational predicates used in the SET
PREDICATE will be the dimensionally consistent relations
defined for the domain under consideration. These will be
typically of the form (x r y), where x and y refer to
instances of templates of the domain, or of the form (x
r1:r2:...:rn y), where r1:r2:...:rn is a RELATION PATH
(RELPATH). The relation path above has the following
interpretation: Starting from x, if one traversed in the
data base the relation path r1:r2:...:rn, in ALL POSSIBLE
WAYS, then for EVERY SUCH traversal, y will be among the
objects in the data base reached from x via the relation
path, if (x r1:r2:...:rn y) is true. Relation predicates of
the form (x r y) may have one of three possible truth values
in the MDS data base: TRUE (T), FALSE (NIL) or UNKNOWN (?).
In the case of (x r1:r2:...:rn y) its truth value will be ?,
if y was not among the objects reached from x via the
relation path, for some traversal of the path, and the
collection of objects so reached, say (y1 y2 ...), had ?
included in it. The truth value will be NIL if neither y
nor ? were included in (y1 y2 ...). The SET PREDICATES of
CC's are evaluated in three valued logic, where T dominates
?, ? dominates NIL, and ¬? = ? (i.e. (T V ?) = T, (? V
NIL) = ?). While writing CC's, one may, where convenient,
omit one or more relations from a relation path of the form
r1:r2:...:rn. The system will find the shortest relation
path that is dimensionally consistent for the given (x
r1:r2:...:rn y). Thus, for the CASNET domain, as described
here, the form (s causedby NIL) is not dimensionally
consistent, since the "causedby" relation is not defined for
STATEDESN and s is an instance of STATEDESN. In CASNET, for
STATEDESN, only (STATEDESN causes CAUSEDESNS), (CAUSEDESN
state STATEDESN), and (CAUSEDESN causedby STATEDESN) are
dimensionally consistent. Thus, only a CAUSEDESN may have
the relation "causedby" associated with it. In the case of
(s causedby NIL) the system will, therefore, automatically
interpret it as (s stateof:causedby NIL). Here (s stateof)
will be a collection of CAUSEDESNs. For all the CAUSEDESNs
in (s stateof) the (causedby NIL) suffix should hold true.
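
The three valued logic described above can be sketched in Python as follows. The names T, NIL, UNK, or3, not3, and3 and path_truth are assumptions introduced for this illustration. The text states only the OR and NOT rules, so the AND shown here is derived from them by De Morgan's law and is likewise an assumption. The function path_truth covers a single collection of reached objects only.

```python
T, NIL, UNK = "T", "NIL", "?"
_RANK = {NIL: 0, UNK: 1, T: 2}   # T dominates ?, and ? dominates NIL

def or3(a, b):
    """Three valued OR: the dominating value wins."""
    return a if _RANK[a] >= _RANK[b] else b

def not3(a):
    """Three valued NOT: the negation of ? is ?."""
    return {T: NIL, NIL: T, UNK: UNK}[a]

def and3(a, b):
    """AND derived from OR and NOT (an assumption, not stated in the text)."""
    return not3(or3(not3(a), not3(b)))

def path_truth(reached, y):
    """Truth value of (x r1:...:rn y), given the objects reached from x."""
    if y in reached:
        return T
    return UNK if UNK in reached else NIL

print(or3(T, UNK), or3(UNK, NIL), not3(UNK))
```

Ranking the three values makes the "dominates" rule a single comparison, which keeps the OR table easy to check against the two identities given in the text.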

Let us now consider an example of CC that uses
quantifiers. In the domain of CASNET there are not many
such CC's. A simple one, CC92, occurs at the anchor
(CONDPROB nextprob). This says:


[(CONDPROB x) | ((SOME S) (@ causestate:descendant S)
                   (S presence CONFIRMED)
                   (NOT ((SOME Q)
                         (@ causestate:descendant Q)
                         (Q descendant S)
                         (Q presence DENIED)))
                   ((CONDPROB C)
                    (@ causestate:causestateof C)
                    (NOT (C effectstate S))))]

The template for CONDPROB specifies that (CONDPROB nextprob)
is (IT CONDPROB NIL), i.e. a new instance of CONDPROB.
Since CONDPROB is a NODE template and (CONDPROB nextprob (IT
CONDPROB)) is dimensionally consistent, the system would
know that for any given instance of CONDPROB, say p1, (p1
nextprob) should be a unique instance of another CONDPROB,
say p2, provided, of course, CC92 is true. Thus, when (p1
nextprob) is instantiated, the system would automatically
generate the new instance p2, and check whether the SET
PREDICATE of CC92 is satisfied for the free variables (p1
p2). The anchor here is, of course, p1, and p2 is the set
variable. Let us first examine the context of this CC.

As the reader may notice in figure 2, CONDPROB is
called by the anchor (LIKELIHOOD inverseweight), where
LIKELIHOOD itself is called by STATEDESN. Thus for a state,
say xyz, its likelihood will be an instance of LIKELIHOOD,
say h, and the inverseweight of h will be p1. This is shown
in the diagram below:


(<STATE> xyz)--likelihood-->(LIKELIHOOD h)

        causestate

                inverseweight

(CONDPROB p1)--nextprob-->(CONDPROB p2).

        effectstate

(all the states, S, that are descendants
of xyz, and have their presence
CONFIRMED, with no intervening DENIED
states between xyz and S. In addition,
there is no CONDPROB, p, other than p1,
having the same causestate xyz, that has
S as its effectstate).


The causestate of p1 will be xyz (this is specified by CC93
of the anchor (CONDPROB causestate)). In CC92 above, let us
substitute p1 for @. This is what will happen if CC92 is
evaluated at p1. Also, the set variable will now be p2. We
wish to find out the truth value of CC92(p1 p2). The
constraint says the following:

For some descendant, S, of xyz (notice that xyz is the
causestate of p1) S presence is CONFIRMED, with no
intervening DENIED states between xyz and S, and S
itself is not the effectstate of any CONDPROB, p, whose
causestate is also xyz.

The reader may follow this interpretation of CC92 by
substituting "xyz " for every occurrence of "@
causestate:". If no such state, S, can be found then p2
cannot be instantiated as the nextprob of p1. Thus, in the
CC above "(SOME S)" is used in the sense of "(THEREEXISTS
S)".

In general the quantified expressions in SET PREDICATES
have the following forms and interpretations. The
quantified expression

((SOME <TEMPLATE> X) Q(X))

is interpreted as

((THEREEXISTS X) ((<TEMPLATE> instance X) & Q(X))),

where Q(X) is any arbitrary predicate expression. The form
((SOME X) Q(X)) is interpreted as ((SOME ** X) Q(X)). The
form

((<TEMPLATE> X) Q(X))

is interpreted as

((ALL X) ((<TEMPLATE> instance X) -> Q(X))).

The form ((X) Q(X)) is interpreted as ((** X) Q(X)). Thus,
quantifications are always relative to the instances of a
template.
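
The two relative quantifier forms can be sketched in Python as below. The function names some and every, and the sample table of states, are assumptions made for the illustration. The sketch is two valued and ignores the UNKNOWN (?) truth value.

```python
def some(instances, q):
    """((SOME <TEMPLATE> X) Q(X)): some instance X of the template satisfies Q."""
    return any(q(x) for x in instances)

def every(instances, q):
    """((<TEMPLATE> X) Q(X)): every instance X of the template satisfies Q."""
    return all(q(x) for x in instances)

# Hypothetical instances of a template, each with a "presence" value.
states = {"s1": "CONFIRMED", "s2": "DENIED"}

print(some(states, lambda s: states[s] == "CONFIRMED"))
print(every(states, lambda s: states[s] == "CONFIRMED"))
```

Passing the collection of instances explicitly mirrors the point made in the text: the quantifier ranges only over the instances of the named template, never over the whole data base.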


4.3.2:  The  interpretation  of  CC’s. 

We have already seen the IFF interpretation of a CC.
There is also an IF interpretation. It is possible in many
situations that one wishes to specify a SET PREDICATE of the
form Q(@ y), such that, for an anchor (X r), its associated
CC[X r], an instance x of X, and instance y of the called
template Y of (X r), it is true that,

IF INTERPRETATION OF A CC:

(x  r  y)  ->  Q(x  y). 

Thus, if one asserts (x r y) then Q(x y) should be true.
However, if Q(x y) is true, it does not necessarily follow
that (x r y) is true. We shall refer to CC's of this type
by the name DECLARATIVE CC's. The value of an anchor
instance can be checked using a DECLARATIVE CC, only if a
candidate value is given to it (declared to it). In the
case of IFF interpretation one can often find a value for an
anchor instance, if one existed in the data base, by
evaluating its associated CC. The CC at anchor
(CAUSALNETDEFN startingstates) is an IFF CC
(IMPERATIVE CC). In this case, the evaluation of the CC can
find all the STATEDESNs in the data base, which are the
starting states of a CAUSALNETDEFN.

We shall write the declarative CC, of an anchor (X r),
in the form,

[(<TEMPLATE> y) | (@ r y) Q(@ y)].

This is to be read as, "if y is declared to be the value of
(@ r) then Q(@ y) should be true". The use of y as the SET
VARIABLE in CC[X r] enables us to write declarative CC's in
this form. An example of a declarative CC occurs in the
anchor (CAUSALNETDEFN designatedstates), CC4:

[(STATEDESN S) | (@ designatedstate S) (@ statedesn S)
                 (NOT (@ startingstate S))].


One may read this as

"If S is declared to be the designatedstate of a
CAUSALNETDEFN then (@ statedesn S) and (NOT (@
startingstate S)) should be true."

If no designatedstate is declared then the evaluation of CC4
will return the value ?.
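
The declarative reading of CC4 can be sketched in Python as follows. The function name eval_cc4 and the dictionary layout of the net are assumptions made for the example. The sketch returns one of the three truth values as a string.

```python
def eval_cc4(net):
    """Declarative reading of CC4; returns 'T', 'NIL' or '?'."""
    declared = net.get("designatedstate")
    if not declared:
        return "?"   # nothing has been declared, so nothing can be checked
    ok = all(s in net["statedesn"] and s not in net["startingstate"]
             for s in declared)
    return "T" if ok else "NIL"

# Hypothetical net with three states, of which A is a starting state.
net = {"statedesn": {"A", "B", "C"}, "startingstate": {"A"}}
print(eval_cc4(net))
net["designatedstate"] = {"C"}
print(eval_cc4(net))
```

Unlike the IFF case, the function never proposes a value for the anchor. It can only accept or reject candidates that have already been declared.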


In MDS the commands used to instantiate a relation are
the IR and ASSERT commands. (For the purposes of this
report the reader may think of them as synonyms.) (IR (x r
y)) would attempt to assign y as the value of (x r). (IR (x
r)) would attempt to first find the appropriate y (or
collection of y's) such that (x r y) is true, and assign
this newly found value as the value of (x r). Clearly, a
prerequisite for being able to find the values y, for a
given (x r) is that the CC associated with the anchor (X r)
be an IFF CC, or there be present a TR that finds the
appropriate y. Thus, at the end of defining all the states
of a CAUSALNETDEFN, like GCASNET, and all the causal links,
one may merely issue the command (IR (GCASNET
startingstates)) to find and set all the starting states of
GCASNET.


The command (IR (x r y)) will succeed only if CC[X r](x
y) does not evaluate to NIL, and at ALL the other
anchors (z r1), the following is true: if (z r1)
depended on the value of (x r), (*) then CC[Z r1](z)
(**) also does not evaluate to NIL. While evaluating
CC[Z r1](z) the value of (x r) will be hypothesized to
be y. If CC[X r](x y), or any of the CC[Z r1](z), evaluated
to NIL then y will not be accepted as the value of (x r).

* (Z r1) is said to depend on (x r) if (x r) occurs in the SET
PREDICATE of CC[Z r1]. That is, the truth value of the set
predicate in CC[Z r1] depends on the value of (x r).

** Notice that in the evaluation of CC[Z r1] only the anchor
is being bound to z. The set variable is not being bound. In
this case the values of (z r1) existing in the data base will
be used as bindings for the set variable.
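
The acceptance rule for (IR (x r y)) can be sketched in Python as below. This is an illustration under stated assumptions and not the MDS control structure: the function name ir, the tuple keys used for anchors, and the two sample constraints own_cc and dep_cc are all hypothetical.

```python
def ir(db, anchor, y, cc, dependent_ccs):
    """Sketch of (IR (x r y)): accept y only if no relevant CC evaluates to NIL."""
    trial = dict(db)
    trial[anchor] = y                 # hypothesize y as the value of (x r)
    if cc(trial, y) == "NIL":         # stands for CC[X r](x y)
        return False
    if any(dep(trial) == "NIL" for dep in dependent_ccs):   # CC[Z r1](z)
        return False
    db[anchor] = y                    # commit only after every check passes
    return True

# Hypothetical constraints, for illustration only.
db = {("net", "commonthreshold"): 4}
anchor = ("s1", "threshold")
own_cc = lambda d, y: "T" if y > 0 else "NIL"
dep_cc = lambda d: "NIL" if d[anchor] > d[("net", "commonthreshold")] else "T"

print(ir(db, anchor, 4, own_cc, [dep_cc]))
print(ir(db, anchor, 9, own_cc, [dep_cc]))
```

Working on a copy of the data base keeps the hypothesized value separate from the stored one, so a rejected candidate leaves the data base unchanged.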

Any time a request for the value of (x r) is made, and
the stored value of (x r) in the data base is UNKNOWN (?),
the system will automatically evaluate the CC and TR
associated with the anchor (X r), and try to find the y,
such that (x r y) is true. If such a y is found then the
stored value of (x r) will be updated in the data base.
This process of fixing the values of instantiated anchors is
used in CASNET to instantiate many of the non-prompted
instances of anchors, whenever a need for their values is
encountered. For example, when the CC for (LIKELIHOOD
probability) is evaluated it would call for the values of
the associated instances of (LIKELIHOOD forwardweight) and
(LIKELIHOOD totalinverseweight). The values for these two
relations will then get instantiated.

This completes our discussion of CC's. We shall not
discuss here the forms and interpretations of transformation
rules. We shall explain them in the next section in the
context of their use.

5.0: The MODEL DEFINITION and MODEL INSTANTIATION
Processes.

The structures, constraints and transformations in CASNET
description embody implicitly all the sequential processes
necessary for the instantiation of models, consistent with the
specified constraints, and also for answering questions about
the CASNET, instances of DISEASEMODEL, or instances of
DISEASEDESN. In the context of interactions with a user, MDS
can unravel the sequential processes, implicit in the
descriptions, that are appropriate for the context, and
execute them in the proper manner. In this section we shall
illustrate some of the kinds of interactions that can take
place, and discuss the processes that are triggered by them.
The commands used in the interactions (the IT --Instantiate
Template--, IR --Instantiate Relation-- and ASSERT), and the
control structures involved in the execution of the commands
are general control structures, that are part of the MDS
system itself. In the context of the KNOWLEDGE about CASNET,
described to MDS, the particular manifestations described
below take place. The sequential processes initiated by the
commands depend upon the KNOWLEDGE described to MDS. In the
context of KNOWLEDGE for a different domain, the sequential
processes will be quite different. In this sense MDS is a
META SYSTEM. Its operation depends upon the KNOWLEDGE
described to it. As part of its instantiation process it can,
if necessary, invoke its general problem solving machinery.
This machinery contains both a THEOREM PROVER, and a GOAL
ORIENTED PROBLEM SOLVER (called DESIGNER), that can plan and
execute actions. It is this feature that makes MDS a very
powerful system. The general problem solving facilities
enable MDS to UNDERSTAND the descriptions given to it
(hopefully in a way analogous to the way that an intelligent
human being would understand). A fairly detailed discussion
of the operation of MDS appears in [Srinivasan 1975a].


In the context of CASNET one may think of the sequential
processes in terms of the following tasks:

[A]. Generation of CAUSALMODEL for a given disease.

[B]. Using CAUSALMODEL to generate descriptions of disease
afflictions in patients. There are three subtasks
here:

1. Record Keeping.

2. Application of Tests.

3. Diagnosis and Therapy recommendations.

In this section, we shall discuss the way these tasks are
performed by MDS using the description of CASNET.


5.1: The Definition of CAUSALMODEL.


The model definition process is initiated by the system
command,

(IT CAUSALMODEL disease-name).

This would create an instance of the CAUSALMODEL, and assign
the given disease-name to the model. We shall illustrate a
part of the model building process for a fragment of the
GLAUCOMA disease (*). Let the disease name be MINIGLAUCOMA. A
simulated sample session of the form shown in figure 7 would
be typical of the kinds of user interactions that could take
place.


* The CASNET is discussed in greater detail in [Weiss 1974], pp


(IT CAUSALMODEL MINIGLAUCOMA)
MINIGLAUCOMA causalnetdefn:statedesns...
(Increased-intraocular-pressure
Cupping-of-optic-disc;nerve-damage
Loss-of-vision)

Increased-intraocular-pressure startingweight...0.2
... causes...a CAUSEDESN
... causes:state...Cupping-of-optic-disc;...
... causes:transitionprob...0.[illegible]
... threshold...?

Cupping-of-optic-disc;nerve-damage
startingweight...?
causes...a CAUSEDESN
causes:state...Loss-of-vision
causes:transitionprob...0.8
threshold...?

Loss-of-vision causes...NIL
... threshold...?

MINIGLAUCOMA commonthreshold...4

(Increased-intraocular-pressure threshold 4)
(Cupping-of-optic-disc;nerve-damage threshold 4)
(Loss-of-vision threshold 4)


FIGURE 7: A part of the CAUSALMODEL acquisition process.

Because of the prompt flags associated with
"causalnetdefn", "testdesns", and "classifications" these
relations will be instantiated in the order they appear in the
template. The instantiation of (MINIGLAUCOMA causalnetdefn)
will cause a new instance of CAUSALNETDEFN to be created and
assigned as the causalnetdefn of MINIGLAUCOMA. In
CAUSALNETDEFN the relations, "statedesns", "causedesns" and
"commonthreshold" have prompt flags. These relations will now
be instantiated in the order shown. This will generate the
request

"MINIGLAUCOMA statedesns..."

to the model builder, as shown in figure 7 above. The user
might respond with a list of names of <STATE>s, as shown in
the figure. In response to this the system will create an
instance of STATEDESNS, and assign the given <STATE>s as the
elements of the instance of STATEDESNS, and assign the
STATEDESNS itself as the (MINIGLAUCOMA statedesns). The
elements of STATEDESNS have got to be instances of STATEDESN.
Since the indicated <STATE>s do not already exist in the data
base, the system will create for each <STATE> a new instance
of STATEDESN with the given name. Each such new instance of
STATEDESN will in turn cause new promptings to be generated,
to complete the instantiation of the relations in STATEDESN
that have the prompting flag. Thus, for
Increased-intraocular-pressure, its "startingweight",
"causes", the "state" and "transitionprob" of the causes, and
the "threshold" would be acquired. The threshold is specified
in figure 7 to be ? (UNKNOWN). Similar promptings
generated for the other STATEs are shown in the figure. The
last prompting in figure 7 is for (MINIGLAUCOMA
commonthreshold). This is set to be 4. This will interact
with all the instances of the anchor (STATEDESN threshold),
because the relation "commonthreshold" appears in the CC,
CC67, at the anchor (STATEDESN threshold), as shown below:

CC67: CC[STATEDESN threshold].

[(THRESHOLD T) | ((@ threshold T) V
                  (@ statedesnof:commonthreshold T))]


Thus, the instantiation of (MINIGLAUCOMA commonthreshold) will
cause the system to evaluate the CC67 at the "threshold" of
every instance of STATEDESN so far created. This evaluation
will now cause the thresholds for all the three instances,
shown in figure 7, to be inferred as being 4. The result of
this process is printed out to the model builder, as shown in
the figure.
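
The effect of CC67 on the three states can be sketched in Python as below. This is one possible reading of the constraint, not the MDS evaluation itself: a threshold that is still UNKNOWN (?) is filled from the commonthreshold of the net, while a threshold that was given explicitly is kept. The function name infer_thresholds is an assumption made for the example.

```python
def infer_thresholds(states, common_threshold):
    """Fill each UNKNOWN ('?') threshold with the commonthreshold of the net."""
    return {name: (common_threshold if t == "?" else t)
            for name, t in states.items()}

# The three states of the sample session, all with threshold '?'.
states = {
    "Increased-intraocular-pressure": "?",
    "Cupping-of-optic-disc;nerve-damage": "?",
    "Loss-of-vision": "?",
}
print(infer_thresholds(states, 4))
```

Keeping an explicitly given threshold corresponds to the first disjunct of CC67, (@ threshold T), and the fill-in corresponds to the second, (@ statedesnof:commonthreshold T).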

In this manner, guided by the prompting flags,
strategically placed at the various anchors, the causalnetdefn
process will ultimately acquire the model for MINIGLAUCOMA.
At the end of this model definition process, the model builder
may, if necessary, delete any of the definitions, and add new
ones. Thus, if a new STATE is to be added to the
(MINIGLAUCOMA causalnetdefn:statedesns) one may simply issue
the command (ASSERT (MINIGLAUCOMA causalnetdefn:statedesn
xyz)) where xyz is the new STATE to be added. This will now
initiate once again the appropriate interactive session. At
any point one may stop the model building process, and start
it later. Also, it is not essential that the facts of the
model should be presented in the order implied by the
collection of prompting flags in CASNET description. They may
be supplied in any order. The system will, at each stage,
verify that the facts supplied do not violate any of the
constraints. MDS uses the descriptions to seek at each point
the appropriate new information necessary to complete the
model building process. If at any time the specifications
violate any of the constraints, it would supply the user the
reasons for the violation and seek modifications. It is also
possible that the model building process was terminated by the
user prematurely. In this case, if one attempted to use the
defined model for instantiating a DISEASEDESN, and if in this
process one or more items of the missing information in the
model definition were required, then at the appropriate points
of the DISEASEDESN instantiation process, MDS could prompt the
user for the missing information, update its model definition
and proceed with the rest of the instantiation process. This
kind of operation is made possible because MDS does, in a
sense, understand the descriptions given to it. Let us now
proceed to examine the model instantiation process.

5.2:  MODEL  INSTANTIATION  process. 

5.2.1: The Record Keeping Process.

The model instantiation process may be started by the
command,

(ASSERT (MINIGLAUCOMA diseasedesn (IT DISEASEDESN))),

if one wanted to create a new instance of DISEASEDESN. Else,
one might in some way refer to an existing DISEASEDESN that
might have been previously (maybe partially) instantiated.
Every time a new instance is created, the first phase of the
process would be what we have called Record Keeping. This
phase would be completed by the promptings:

_ (MINIGLAUCOMA diseasedesn:causalmodel MINIGLAUCOMA)



[This is done because "causalmodel" has been
declared to be the name of the inverse of
"diseasedesns", in the CAUSALMODEL template.]

_ (MINIGLAUCOMA diseasedesn:date 10-12-75)

_MINIGLAUCOMA diseasedesn:diseaseof...PERSON John

[This would cause either a new PERSON, John, to be
created, or pick out the John that already might
exist in the data base. If a positively new John
PERSON is needed, then one might say, for example,
NEW PERSON John. In this case the following
additional promptings will occur to complete the
description of John.]

_John sex...MALE,

_John age...68


Notice that the (DISEASEDESN date) did not get prompted,
because the (DATE) function supplied the date. This fact is
one of the items displayed in the beginning. [The date shown
is probably the date at which the MDS implementation would
have progressed enough to execute the CASNET described here!]
As a result of this kind of record keeping, one may at a later
occasion refer to the diseasedesn of a patient, as follows:


[(DISEASEDESN x) | (John disease x) (x date 10-11-76)
                   (x causalmodel MINIGLAUCOMA)].


Hopefully no more than one DISEASEDESN would satisfy this
request! [One may, for example, choose to distinguish between
DISEASEDESNs for people with the same name, created on the
same date, for the same disease, by assigning names to the
descriptions.]


5.2.2: The Test Selection and Application
Processes.


[a]. The Test Selection Process.


Part 2 of the model instantiation process, which embodies
the application of the tests, represents the heart of the
process. Based on the structure of the tests as designed by
the model builder, a mixture of predetermined and dynamically
determined test applications will determine the particular
configurations of the causalnet of the DISEASEDESN. We shall
briefly examine below the important parts of this process.

For convenience, let us call the DISEASEDESN, GDESN. The
test application phase will begin with the instantiation of
(GDESN causalnet), which is the next relation with prompting
flag, in the DISEASEDESN template. This will cause a new
instance of CAUSALNET to be created and set as the causalnet
of GDESN. In the CAUSALNET the relations "states" and
"causes" will now be instantiated. Let us denote the new
instance of CAUSALNET by GNET. (GNET states) will consist of
one instance of each STATE in MINIGLAUCOMA. So also, (GNET
causes) will consist of one instance of each <CAUSE> in
MINIGLAUCOMA. These instantiation processes take place
because the called templates for (GNET states) and (GNET
causes) are respectively, (IT (? CC40)) and (IT (? CC41)).

These two consistency conditions, shown below, bind the
arguments of the function calls to IT, to the
(MINIGLAUCOMA causalnetdefn:statedesns) and
(MINIGLAUCOMA causalnetdefn:causedesns),
respectively:

CC40: [(TEMPLATE X) | (@ causalnetdefn:statedesns X)]

CC41: [(TEMPLATE X) | (@ causalnetdefn:causedesns X)].


Those instances of <STATE>s and <CAUSE>s will now form the
skeleton of the new diseasedesn to be created for the new
patient. In each <STATE> and <CAUSE>, again the relations
with prompting flags will get instantiated in the appropriate
manner. Once this is done MDS will move on to the next anchor
with prompt flag in DISEASEDESN, namely (DISEASEDESN
topleveltest). A new instance of TOPLEVELTEST will be
created, and since the relation "topleveltest" in DISEASEDESN
has the D flag, the system will now proceed to complete all
the relations in the new instance of TOPLEVELTEST. (Let us
denote the new instance of TOPLEVELTEST by tltest1.) This will
result in the following.

TOPLEVELTEST has five relations: "currenttests",
"nexttest", "selectedtests", "descendents" and "ancestors".
The currenttests will get instantiated first. This will cause
its associated CC, CC39, to be executed. CC39 is shown in
figure 8. It may be paraphrased as follows:

The tests to be applied in a given situation of the model
instantiation process are selected from the currenttests
of the current TOPLEVELTEST. The currenttest, M, should
not be one of the already selected tests, and in addition
several other conditions are required to be satisfied.
Let us follow the conditions specified in figure 8.

The following objects in the neighborhood of tltest1 are
first picked out:

MINIGLAUCOMA--testdesns-->N
|
causalmodel
|
|
GNET--topleveltest-->tltest1

The variable X in the CC will be bound to MINIGLAUCOMA,
the variable CN to GNET, and N to (MINIGLAUCOMA
testdesns).

There are two cases to consider: One is (@ ancestor
NIL). (This is the case for tltest1.) The other is ¬(@
ancestor NIL). This case will apply for the descendants
of tltest1. In the case of tltest1, the WIPEOUT test of
the MINIGLAUCOMA will be chosen. The WIPEOUT test will
be such that it is not the nexttest of any other test,
and also, it is not the negativenexttest of any other
test. Once this selection is made, it will be
instantiated as the currenttests of tltest1.

If a WIPEOUT test, Q, had just previously been
chosen and applied, then we will be in the case ¬(@
ancestor NIL). In this case the currenttests will be (Q
nexttest) if (Q testresult is YES), else if (Q testresult
is NO) then the currenttests will be (Q
negativenexttest), if such nexttests exist.
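
The choice of follow-up tests after a WIPEOUT test has been applied can be sketched in Python as below. The record layout, the function name follow_up_tests, and the two test names are hypothetical and serve only to illustrate the YES and NO branches.

```python
def follow_up_tests(q):
    """Tests that follow an applied WIPEOUT test q, chosen by its result."""
    if q.get("testresult") == "YES":
        return q.get("nexttest", [])
    if q.get("testresult") == "NO":
        return q.get("negativenexttest", [])
    return []   # no result yet, or no such nexttests exist

# Hypothetical test record; the test names are invented for the example.
q = {"testtype": "WIPEOUT", "testresult": "NO",
     "nexttest": ["detailed-exam"], "negativenexttest": ["routine-check"]}
print(follow_up_tests(q))
```

Returning an empty list covers the closing condition of the paragraph, "if such nexttests exist".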


[(TESTDESN M) |
  ((TOPLEVELTEST W) ((@ ancestor W) ->
                     ¬(W selectedtest M)))
  ((THE CAUSALMODEL X) (THE TESTDESNS N)
   (THE CAUSALNET CN)
   ((SOME TESTDESN Y)
    ((@ ancestor Y) V (@ is Y))
    (Y topleveltestof CN) (CN causalmodel X))
   (M testdesnof X) (X testdesns N)
   [(@ ancestor NIL)
    (M testtype WIPEOUT) (M nexttestof NIL)
    (M negativenexttestof NIL)
    V
    ¬(@ ancestor NIL)
    [((THE ** Q) (@ nexttestof Q) (Q testtype WIPEOUT)
      ((Q testresult YES) (Q nexttest M)
       V (Q testresult NO) (Q negativenexttest M)))
     V
     ((THE STRATEGY s) (N strategy s)
      (s influences TESTS)
      [(s is MINCOST) (M elemof (SMIN N cost))
       V
       (s is MAXWEIGHTMINCOST)
       (M affectedstate
          (SMAX ((** Z) |
                 (Z iinstanceof:iinstanceof
                    STATEDESN)
                 (CN state Z)
                 ((SMIN N cost) affectedstate Z))
                likelihood:probability))
       V
       (s is MAXWEIGHTCOSTRATIO)
       (M elemof (SMAX N costratio))
       V
       (s is MAXWEIGHT)
       (M affectedstate
          (SMAX (CN states)
                likelihood:probability))])]])]

"iinstance" stands for "immediate instance".

"instance" is a transitive relation, "iinstance" is not.


FIGURE 8: THE CC for the selection of "currenttests".




If all the WIPEOUT tests had been completely applied,
we will still be in the case ¬(@ ancestor NIL). In this
case the selection of currenttests depends on the
strategy used for MINIGLAUCOMA. The strategy will be, of
course, the strategy of N, the testdesns of MINIGLAUCOMA.
If the strategy influences TESTS (here "TESTS" will be
an instance of the INFLUENCE template), then there are
four cases to be considered depending upon whether the
strategy, s, is MINCOST, MAXWEIGHTMINCOST,
MAXWEIGHTCOSTRATIO, or MAXWEIGHT.

The words "SMIN" and "SMAX" in the CC in figure 8 refer
to function templates that have been declared to the domain.
The definitions of these function templates appear as follows:


[TDN: SMIN (FNDEF LISTDT (RELPATH CC58))]
[TDN: SMAX (FNDEF LISTDT (RELPATH CC59))].


The constraints CC58 and CC59 specify that the RELPATH should
be such that for all X in the first argument (arg1) of the
function (notice that the first argument is a LISTDT) the
RELPATH (Relation Path) should be dimensionally consistent
with X. Also, (X <RELPATH>:#) should be an INTEGER or NUMBER
(*). The SMIN and SMAX functions pick out all the X's (there
can be more than one), in the arg1 of the functions, for which
(X <RELPATH>:#) has respectively the MIN and MAX values, among
the objects in arg1.
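The behaviour described for SMIN and SMAX, returning every element that attains the extreme value rather than a single one, can be sketched as follows. This is a minimal Python illustration; the names `smin`, `smax` and the `value` callback (standing in for the relation path followed by the # relation) are assumptions and not part of MDS.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def _extremal(items: Iterable[T], value: Callable[[T], float], pick) -> List[T]:
    """Return all items whose value equals the min (or max) over the list."""
    pool = list(items)
    if not pool:
        return []
    best = pick(value(x) for x in pool)
    return [x for x in pool if value(x) == best]


def smin(items: Iterable[T], value: Callable[[T], float]) -> List[T]:
    """All elements with the minimum value; ties are all kept."""
    return _extremal(items, value, min)


def smax(items: Iterable[T], value: Callable[[T], float]) -> List[T]:
    """All elements with the maximum value; ties are all kept."""
    return _extremal(items, value, max)
```

For example, `smin(tests, lambda t: t["cost"])` would correspond to (SMIN N cost) when each test is represented as a dictionary with a numeric "cost" entry.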


The relation paths chosen in figure 8, for the
various strategies and associated applications of SMIN
and SMAX functions are: "cost" for MINCOST, tests of


* The # relation removes the dimension of a number. For
example, the dimension of (John age) would be YEARS, because
the age of a PERSON has been defined to be YEARS. However,
the dimension of (John age:#) will be just NUMBER. In the
case of collections, the # relation is used to refer to the
cardinality of the collection.




lowest cost are chosen. "Likelihood:probability" in the
case of MAXWEIGHTMINCOST; from the set of tests with
minimum cost, a test is picked such that one of its
affected states has a weight greater than any other state
affected by the other tests. In the case of
MAXWEIGHTCOSTRATIO, all tests with the maximum
"costratio" are chosen, and, in the case of MAXWEIGHT,
all tests whose affectedstates have the maximum
likelihood:probability are chosen. In all these cases,
it is required that the chosen STRATEGY should influence
TESTS.
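The four strategies can be summarized in a short Python sketch. It is an illustration under stated assumptions, not the CASNET or MDS implementation: tests are represented as dictionaries with hypothetical keys "cost", "costratio" and "affects", and `weight` is a hypothetical mapping from every state of the causal net to its likelihood:probability.

```python
def _extremal(items, key, pick):
    """All items whose key equals the min or max over the collection."""
    pool = list(items)
    if not pool:
        return []
    best = pick(key(x) for x in pool)
    return [x for x in pool if key(x) == best]


def select_current_tests(strategy, tests, weight):
    """Pick candidate tests according to one of the four strategies."""
    if strategy == "MINCOST":
        return _extremal(tests, lambda t: t["cost"], min)
    if strategy == "MAXWEIGHTCOSTRATIO":
        return _extremal(tests, lambda t: t["costratio"], max)
    if strategy == "MAXWEIGHTMINCOST":
        # Restrict to the cheapest tests, then look only at states they affect.
        pool = _extremal(tests, lambda t: t["cost"], min)
        states = {s for t in pool for s in t["affects"]}
    elif strategy == "MAXWEIGHT":
        # Consider every state of the net, as (SMAX (CN states) ...) does.
        pool = list(tests)
        states = set(weight)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    top = set(_extremal(states, lambda s: weight[s], max))
    return [t for t in pool if top & set(t["affects"])]
```

Note that under MAXWEIGHT the result is empty when no test affects a state of maximum weight, which mirrors the constraint as written rather than falling back to a weaker candidate.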


[It should be noted that the selection criteria specified
here are all fixed for the CASNET system at the time of
definition of the CASNET system itself.]


After instantiating the currenttests, the system would
move to the instantiation of the next anchor in TOPLEVELTEST,
namely, the anchor with the relation "selectedtests". The
selectedtests will be the currenttest, if the currenttest is
unique, else the user will be prompted for the choice of one
or more of the currenttests. The selection of the tests is
controlled by CC103,


CC103: CC[TOPLEVELTEST selectedtests].

[(** X) | ((@ currenttests:# 1) (@ currenttest X) V
           (@ selectedtests X) (@ currenttest X))]

Notice that one of the disjuncts in the above CC would pick
the selectedtests, if the number of currenttests is 1, (@
currenttests:# 1). If not, the phrase "(@ selectedtests X)"
indicates that the selected test should be specified by an
external source. When (TOPLEVELTEST selectedtests) is
instantiated every selected test will be instantiated. This
will cause the selected test to be applied. We shall discuss




the test application process in section 5.2.2[b].
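The two disjuncts of CC103 amount to the following decision, sketched here in Python. The function `select_tests` and the `choose` callback (standing in for the external source, for example a prompt to the user) are hypothetical names introduced only for this illustration.

```python
from typing import Callable, List, Sequence


def select_tests(current_tests: Sequence[str],
                 choose: Callable[[Sequence[str]], Sequence[str]]) -> List[str]:
    """Return the selected tests for one top-level test.

    A unique current test is selected automatically. Otherwise the
    external source is asked, and its answer must be drawn from the
    current tests, as the constraint requires.
    """
    if not current_tests:
        return []
    if len(current_tests) == 1:
        return list(current_tests)
    chosen = list(choose(current_tests))
    if not chosen or any(t not in current_tests for t in chosen):
        raise ValueError("selected tests must be drawn from the current tests")
    return chosen
```

Rejecting an answer outside the current tests plays the role of the conjunct (@ currenttest X) in the second disjunct.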


After the application of the selectedtests, the "nexttest"
of the TOPLEVELTEST will be instantiated. This will again be
an instance of TOPLEVELTEST. Since the relation
"topleveltest" in DISEASEDESN has the D flag
associated with it, this new instance of TOPLEVELTEST will now
have to be again completely instantiated. Thus, the whole
process will iterate, until it is terminated by the
"currenttests" becoming NIL, for some TOPLEVELTEST in the
sequence. When this happens, the nexttest also will become
NIL, as indicated by CC5.

The next relation of DISEASEDESN to be prompted is
"diagnosis/therapy". The instantiation of this anchor is
controlled by CC37. This is used to pick out the appropriate
COMMENTS depending on the PATHWAYS in the CAUSALNET, and the
CLASSDEFNS for MINIGLAUCOMA. We shall discuss this part of
the model instantiation process in section 5.2.2[c].

[b]. The Test Application Process.

As mentioned before, in section 3 each <TEST> has only
three relations, "costratio", "testresult" and "application",
that are instantiable. The remaining relations in TESTDESN
are all instantiated to constants. When the selected tests
are instantiated, the system will be still under the control
of the D flag of the anchor (DISEASEDESN topleveltest). This




will cause all the above three relations in each <TEST> to be
instantiated. The first one will be "testresult". This is
controlled by CC34 and TR1. CC34 is shown below,


((YESNO Y) | [(@ testresult Y) (Y among (YES NO ?? NA))
              (@ testtype:among (SQ GROUP WIPEOUT))
              ((SOME ** X) (@ componentof X)
               [((X testtype WIPEOUT) ->
                 ((X testresult NO) (Y is NA) V (Y is ?)))
                V
                ((X testtype MC) ->
                 ((SOME U) (X component U) ¬(U is @)
                  ((U testresult ??) (Y is ??) V
                   (U testresult YES) (Y is NO))))])])


Normally, the testresult should be supplied by an
external source. In certain cases, as in the case of earlier
applications of WIPEOUT and MC (multiple choice) tests, (*)
the result may be determined by earlier test results. In
these cases, if a value for Y is supplied from an outside
source, it should be consistent with the conditions specified.
Let t1 be the current test. By an analysis of the various
CC's in CASNET (MDS would perform this analysis) it may be
noticed that the following series of interactions would take
place whenever the testresult of a test, like t1, is changed:


* The results for combination tests may also be similarly
taken care of. Weiss [Weiss 1974] has defined combination
tests as a separate category of tests. These tests are
intended to take care of certain kinds of interactions among
test results. In the MDS context, the definition of
combination tests is the same as the definition of new
consistency conditions, pertaining to the way that test
results affect states. For this reason, we have not
considered combination tests as a separate category of tests
in the description presented here.




(t1 testresult) affects (X status) for all states X,
such that (t1 affectedstate is X). We shall write this
as,

[TESTDESN testresult] interactions:

    [(X status) ; ((STATEDESN X) | (@ affectedstate X))].

The anchor symbol, @, here actually refers to a (TESTDESN
iinstance:iinstance). This is because one level of
instantiation was skipped by the use of the C flag.

Similarly, the following other interactions may be
identified:

[STATEDESN status] interactions:

    [(@ presence)].

[STATEDESN presence] interactions:

    [(X forwardweight) ;
        ((LIKELIHOOD X) |
            (@ stateof:causes:state:likelihood X))]

    [(X totalweight) ;
        ((LIKELIHOOD X) |
            (@ causes:state:likelihood X))]

    [(X inverseweight) ;
        ((LIKELIHOOD X) | (@ likelihood X))]

    [(X effectstate) ; ((CONDPROB X) |
        (@ ancestor:causestateof X))]

    [(X nextprob) ; ((CONDPROB X) |
        (@ ancestor:causestateof X))]

    ... etc.


Similarly, one may now peruse through the various interactions
caused by [LIKELIHOOD forwardweight], [LIKELIHOOD
totalweight], etc., on yet other anchors in the system. For
our purposes here it suffices to note that a whole series of
interactions and side effects may propagate through the
system, every time the testresult of a test is changed. MDS
is made aware of this via the definition of the various CC's.
For every anchor, MDS will build the interactions list of the




form shown above. These interactions lists will be used to
check consistency in every updating process. Every time an
anchor is changed, MDS can access and check all the other
anchors in the system that are affected by it. A change will
be accepted only if it does not produce a contradiction in any
of the affected anchors. If a contradiction is produced, or
in general, if MDS is not able to find the value for an
anchor, it will consult the TR associated with the anchor.
Let us consider a part of the updating process associated with
the testresult. The transformation rule TR1 is associated
with (TESTDESN testresult) [actually this is associated with
(TESTDESN iinstance:testresult) because of the use of the C
flag]. This transformation rule is shown below:


TR1: TR[TESTDESN testresult]

(DCOND
  ((@ testresult ?)
   (ASSERT
     (@ testresult
        (ASKQUESTION (@ summaryquestion))))))

[ON-CONTRADICTION
  (((X) (@ affectedstate X)
    (NOT ((EXISTING (X status)) = (NEW (X status)))))
   (NBT
     (DCOND
       (((ADD (EXISTING (X status))
              (NEW (X status)))
         #:= 0)
        (SOME STATUS S)
        (BIND S (EXISTING (X status)))
        (ASSERT (X conflict:#:#of S))
        (ASSERT (X status:# 0)))
       (((ABS (NEW (X status))) >
         (ABS (EXISTING (X status))))
        (ASSERT (X status (NEW (X status))))
        (DCOND
          (((ABS (X status:#)) >
            (ABS (X conflict:#)))
           (NOT (X conflict:# 0)))
          (ASSERT (X conflict:# 0)))))))]




The function ABS in the rule above is the Absolute Value
function. DCOND is the conditional statement (DESIGNER COND).
It is similar to the LISP COND statement. The EXISTING values
used in the rule are the values that are currently present in
the model space of CASNET. The NEW values are the values
that we wish to change to. "NBT" indicates that
no BACKTRACKING is to take place while executing the portion
within its scope. In general, transformation rules in MDS are
executed in a backtracking environment. The above
transformation rule may be paraphrased as follows:

If the testresult is UNKNOWN then ask the summaryquestion
of the test, and assert the answer as the test result.

If a contradiction is obtained in asserting a testresult
then do the following: For all states X that are the
affectedstates of the test, if the EXISTING status of the
state is not the same as the NEW status, then

If the sum of the NEW and EXISTING status is 0, assert
that the conflict of X is equal to the EXISTING status of
X, and set the status of X to 0.

If the absolute value of the NEW status of X is greater
than the absolute value of the EXISTING status, then set
the status of X to the new status.

If the absolute value of the NEW status of X is
greater than the absolute value of the conflict of X, and
the conflict is not 0, then set the conflict to 0.
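The contradiction-handling part of this paraphrase can be written out as a small Python function. It is a sketch under assumptions: statuses and conflicts are taken to be signed integers, the name `resolve_status` is hypothetical, and the third condition is read as applying only after the status has been replaced by the new one.

```python
from typing import Tuple


def resolve_status(existing: int, new: int, conflict: int = 0) -> Tuple[int, int]:
    """Return (status, conflict) for one affected state after a
    contradicting update from `existing` to `new`."""
    if existing == new:
        # No contradiction for this state; nothing changes.
        return existing, conflict
    if existing + new == 0:
        # Equal and opposite evidence: record the old status as the
        # conflict and neutralise the status.
        return 0, existing
    if abs(new) > abs(existing):
        # Stronger new evidence wins, and it clears a weaker
        # recorded conflict.
        if conflict != 0 and abs(new) > abs(conflict):
            conflict = 0
        return new, conflict
    # Weaker new evidence leaves the state untouched.
    return existing, conflict
```

For instance, an existing status of 2 met by a new status of -2 gives a status of 0 with the conflict recorded as 2.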

The presence of a state depends on its status. The
changes in the status brought about by TR1 might contradict
the existing presence of a state. In this case, the TR
associated with (STATEDESN presence) will get invoked for
taking care of the contradictions. This TR is shown below.


TR2: TR[STATEDESN presence].

(NBT
  (DCOND
    (¬((EXISTING (@ presence)) is (NEW (@ presence)))
     (ASSERT (@ presence (NEW (@ presence))))
     ((** S) (@ causes:state S)
      (ASSERT (S forwardweight)))
     (DCOND
       ((NEW presence CONFIRMED)
        ((** Q) (@ causedby:state Q)
         (ASSERT (Q inverseweight))))
       (T
        ((** R) (@ causes:state R)
         (ASSERT (R totalweight)))
        (ASSERT (@ inverseweight)))))))


It not only fixes the new presence of a state but also issues
commands to recompute, for all states Q that cause @, their
respective inverseweights (*), if the new value of presence is
CONFIRMED. Other similar commands issued by TR2 may be
followed by the reader.


The computation of candidatestates and candidatetests is
controlled by CC50 and CC51. Here again strategies are used,
as specified at the time of model definition. The reader is
invited to peruse these constraints, shown in Appendix A.


As mentioned before, the test application process will
continue until the next TOPLEVELTEST becomes NIL. At this
point the causal net would reflect the full consequences of
all the tests, as absorbed by the descriptions of the modeling


* As the reader may notice, the "inverseweight" is not defined directly
for a state. It is defined only for (state likelihood). Thus, when (Q
inverseweight) is asserted (as in TR2), for a state Q, the system will
interpret it as (Q likelihood:inverseweight). This kind of
interpretation is possible only when there is a unique way of executing
the assertion.



scheme. The next pending relation will be the next relation
in DISEASEDESN with the prompting flag. This would be
(DISEASEDESN diagnosis/therapy). We shall discuss the
processes involved in the instantiation of this relation in
the next subsection.

[c]. The generation of diagnosis and therapy.

The process of diagnosis and therapy in CASNET is based
on three notions: the notion of the type of the
classification table, the notion of most likely
startingstates, and the notion of admissible pathways from a
most likely starting state. At the time of model definition
one may specify the classification table type to be SPECIFIC
or GENERAL, by instantiating the classtype relation of the
CLASSDESN template. If the type is SPECIFIC then only the
CONFIRMED states will be looked at, when the algorithm of
diagnosis and therapy proceeds. If the type is GENERAL, then
all undenied states will be looked at.


The diagnosis/therapy is under the control of the CC,
CC37, at the anchor (DISEASEDESN diagnosis/therapy). We shall
briefly discuss this CC. The CC itself is shown below:


CC37: CC[DISEASEDESN diagnosis/therapy]:

((COMMENT C) | ((CLASSDEFN E) (THE ENTRYDEFN F)
    (E firstentry:lowerentry F)
    (F entrystate:iinstance:presence CONFIRMED)
    (NOT ((SOME ENTRYDEFN G)
          (E firstentry:lowerentry G)
          (F entrystate:descendent:entrystateof G)
          (G entrystate:iinstance:presence CONFIRMED)))
    (((E classtype SPECIFIC) (ENTRYDEFN H)
      (E firstentry:lowerentry H)
      (H entrystate:descendent:entrystateof F)
      (H entrystate:iinstance:presence CONFIRMED))
     V
     ((E classtype GENERAL) (ENTRYDEFN I)
      (E firstentry:lowerentry I)
      (I descendent:entrystate F)
      (NOT
       (I entrystate:iinstance:presence DENIED))))
    ((SOME PATHWAY P) (ENTRYDEFN J)
     (E firstentry:lowerentry J) (J lowerentry F)
     (J entrystate:componentof P))
    (F comments:is C)))


The process of diagnosis/therapy is viewed in CASNET as
one of extraction of an entry (an instance of ENTRYDEFN) which
is a component of a classification table, a CLASSDEFN. Each
CLASSDEFN has a CLASSNAME (like SPECIFIC, GENERAL etc.), and a
"firstentry", which is an ENTRYDEFN. An ENTRYDEFN itself has
an "entrystate", "descendents", "comments", "nextentry", and
the so-called "lowerentries". The lowerentries of an
ENTRYDEFN is the closure of its "nextentry" relation. The
descendents of an ENTRYDEFN is the collection of all the
STATEDESNs of its lowerentries.


CC37 selects the "deepest" entry in the classification
table in the following sense: The entrystate of the entry
must be CONFIRMED; the entrystates associated with the
lowerentries must all be not CONFIRMED; if the CLASSDEFN is
of type SPECIFIC, then all the entrystates of higher entries
must be not DENIED; and finally, all the entrystates of each
of the entries, including the deepest entry, must all be
components of some given admissible PATHWAY.




The notion of admissible PATHWAY depends on the most
likely starting states, (mlstartingstatesof CAUSALNET). These
states are computed by the function MLSTARTINGSTATES in the
description shown here. This function call serves as an
alternative to implementing the algorithms as part of
consistency conditions. Wherever efficiency considerations
are important, the necessary algorithms may be directly
implemented as functions in MDS. These functions may be
called at the appropriate places in the instantiation process.
Functions implemented in this manner will always "EXECUTE
BLIND". MDS will not be able to monitor the function while it
is executing. In the case of the description presented here,
the algorithm for the most likely starting states is thus
not anywhere described. The algorithm may be paraphrased as
follows:


A starting state is chosen which has more confirmed
descendents without intervening denied descendents than
any other starting state. If one starting state can
explain all the confirmed states (that is, all the
confirmed states are its descendents) then it by itself
is the complete set. If there exists a tie between
starting states, then the one with the greatest starting
weight is chosen. If no single starting state can
explain all the confirmed states, then considering the
remaining starting states and the confirmed states not
yet explained, the process is repeated until either no
confirmed states remain which are not explained, or some
confirmed states have no explainable starting states.
The set of states that explain the greatest number of
confirmed states then become the most likely starting
states.
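Read as a greedy covering procedure, the paraphrase can be sketched in Python as below. This is not the MLSTARTINGSTATES function itself, which is not given in the description. The graph representation (a dictionary from each state to the states it causes), the function names, and the decision to let a starting state explain itself when it is confirmed are all assumptions made for the illustration.

```python
def reachable(graph, start, denied):
    """States reachable from start without passing through a denied state
    (start itself included, unless it is denied)."""
    seen, stack = set(), [start]
    while stack:
        state = stack.pop()
        if state in seen or state in denied:
            continue
        seen.add(state)
        stack.extend(graph.get(state, ()))
    return seen


def most_likely_starting_states(graph, starting, weight, confirmed, denied):
    """Greedily pick starting states until the confirmed states are explained.

    Returns (chosen starting states, confirmed states left unexplained).
    """
    unexplained = set(confirmed)
    remaining = list(starting)
    chosen = []
    while unexplained and remaining:
        cover = {s: reachable(graph, s, denied) & unexplained for s in remaining}
        # Most newly explained confirmed states first, ties broken by weight.
        best = max(remaining, key=lambda s: (len(cover[s]), weight.get(s, 0)))
        if not cover[best]:
            break  # nothing left can be explained by any starting state
        chosen.append(best)
        unexplained -= cover[best]
        remaining.remove(best)
    return chosen, unexplained
```

A non-empty second component of the result corresponds to the case in which some confirmed states have no explainable starting states.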


A state, S, lies on an admissible pathway from a most
likely starting state, SS, if it is a descendent of SS, if
there exist no intervening denied states between S and SS, and



if S has a confirmed descendent with no intervening denied
descendents. The PATHWAY template incorporates this notion,
via the CC's, CC48, CC49 and CC66, anchored at the relations
startingstates, components and nextpathway, respectively. We
shall now leave it to the reader to verify that the CC's shown
in Appendix A do indeed perform as described above. This
completes our discussion of the description of CASNET in MDS.
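The three conditions for a state to lie on an admissible pathway can be checked with a short Python sketch. As before, this illustrates the definition rather than the CC's of the PATHWAY template: the graph is assumed to be a dictionary from each state to the states it causes, and the function names are hypothetical.

```python
def reachable(graph, start, denied):
    """States reachable from start without passing through a denied state
    (start itself included, unless it is denied)."""
    seen, stack = set(), [start]
    while stack:
        state = stack.pop()
        if state in seen or state in denied:
            continue
        seen.add(state)
        stack.extend(graph.get(state, ()))
    return seen


def on_admissible_pathway(graph, ss, s, confirmed, denied):
    """True if s is a descendent of ss reached without crossing a denied
    state, and s in turn has a confirmed descendent reached the same way."""
    if s == ss or s not in reachable(graph, ss, denied):
        return False
    below = reachable(graph, s, denied) - {s}
    return any(state in confirmed for state in below)
```

In this reading a denied state is never on an admissible pathway, since it blocks every path that would pass through it.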


We have touched upon the essential aspects of CASNET and
guided the reader through their descriptions in MDS. The
reader should be able to follow the rest without much
difficulty.




6.0  Concluding  Remarks. 

The CASNET description given here is based on the
discussion presented in [Weiss 1974]. Since then, the CASNET
has undergone various modifications. The descriptive
formalism presented here would make it easy to modify or
extend the system. In any such updating process the MDS
itself may be consulted about any of the existing parts of
CASNET. It can answer questions about the CASNET, questions
pertaining to the various structures, or questions pertaining
to the details of the test application and model instantiation
processes. Thus, the description of CASNET in MDS can be used
also as a documentation of what CASNET is. This has important
consequences. This can, for example, make it possible to use
CASNET in a teaching or testing mode. A CAUSALMODEL defined
in MDS may be used to teach about the disease process, or to
check the answers provided by people who are being tested on
their knowledge of the disease.

But questions on the CASNET need not necessarily pertain
directly to the description of CASNET as presented to MDS. MDS
can also answer questions whose answers would have to be
inferred from the descriptions of a domain. The Theorem
Prover in MDS may be used for this purpose. Thus, one obtains
flexibility, clarity of expression and versatility. In the
early stages of MDS development we anticipate paying a heavy
price in efficiency. We believe efficient implementation of
MDS would be possible, after some experience is gained with an




initial  working  version. 

Not all the concepts in CASNET have been captured by the
description shown here. The concepts of counter and
combination tests have not been described. The interpretation
of unknown responses (??) is left vague. This has been left
vague also in the CASNET system of Kulikowski and Weiss.
Should the unknown responses be considered as testresults by
themselves, or should the tests involved be just set aside for
being repeated at a later stage? The disease domain model
shown here never uses the repeatability relation defined for
TESTDESN. But for these minor omissions the description of CASNET
presented here captures the rest of the system. To illustrate
the ease with which modifications and/or additions to CASNET
can be made, the description of an extended concept of
treatment is shown below. The system is modified to pick out
the treatment that maximizes some cost criteria.

A new template for TREATMENT is added:

[TDN: (TREATMENT $N)
    (priority (PRIORITY TI) priorityof)
    (status (STATUS TI) statusof CC80)
    (treatment (STATEMENT TS) statementof)].

CC80 will be identical to CC11, the CC associated with
(STATEDESN status). Two changes in the existing template
structures are necessary:

(a). In EFFECT template,

    (EFFECT affectedstate STATEDESN)

will become

    (EFFECT affectedstate STATEDESN/TREATMENT)

[STATEDESN/TREATMENT implies "ONEOF STATEDESN or
TREATMENT"]. And,




    (COMMENT therapy STATEMENT)

will become

    (COMMENT therapy TREATMENT).

The CC37 at the anchor (DISEASEDESN diagnosis/therapy)
will have to be changed as follows:

"(F comments:is C)" will become

"(C treatmentof (SMAX ((TREATMENT R) |
                        (TREATMENT S)
                        ((R is S) V
                         (R status:>=:statusof S)))
                       priority))"

The transformation rule TR1 need not be changed. The
above changes will cause a treatment with maximum
priority to be chosen.

It should be noted that if one were to design CASNET in
MDS it is most likely that one would not have implemented it
as shown here. We have here deliberately restricted ourselves
to the description of Weiss's system. The reasoning power of
MDS is not used very much in the process of model
instantiation. The formalism is used here primarily for the
description of an existing program. In our use of MDS as a
design tool for the BELIEVER system we begin to see the
usefulness of the formalism to express complex structures and
their implied use in a variety of problem solving processes.
This is discussed in [Sridharan 1975 a,b].

At the moment it is not clear to us whether the power
and facilities available in MDS are, indeed, needed for the
kinds of problems encountered in medical modeling. This
report should contribute to the making of that decision.




7.0:  ACKNOWLEDGEMENTS: 

We wish to thank Shalom Weiss and Casimir Kulikowski for
their help in explaining to Joel the various aspects of the
CASNET implementation. Many of these aspects could not have
been otherwise obtained without going through the laborious
process of reading the programs.


REFERENCES:

[Weiss 1974]

Weiss, Shalom. "A System for Model Based, Computer Aided
Diagnosis and Therapy", Ph.D. Dissertation, Department
of Computer Science, Rutgers University, N.J.

[Sridharan 1975]

Sridharan, N.S. "The Architecture of BELIEVER-Part I",
Department of Computer Science, RUCBM-TR46, Rutgers University.

[Srinivasan 1975]

Srinivasan, C.V. "The Meta Description System." RUCBM-TR50.

Srinivasan, C.V. "A formalism to define the structure of
Knowledge."

Srinivasan, C.V. "The use of Gentzen's system of logic for
Theorem Proving in MDS," In preparation.

[Srinivasan 1973]

Srinivasan, C.V. "The Architecture of Coherent Information
Systems", Proceedings of the 3rd International Joint Conference
on Artificial Intelligence, Aug. 1973.


APPENDIX I


CASNET DEFINITION


(TDN: (CAUSALMODEL RN)
    (diseasedesns (DISEASEDESNS $L) causalmodel)
    ((testdesns !) (TESTDESNS $L) testdesnsof)
    ((causalnetdefn !) (IT CAUSALNETDEFN)
        causalnetdefnof)
    ((classifications !) (CLASSDEFNS $L)
        classificationsof))


(TDN: (DISEASEDESNS $L)
    (ELEMDN (0 * DISEASEDESN))
    (causalmodel (CAUSALMODEL RN) diseasedesns)
    ((diseaseof V) (PEOPLE $L) disease))


(TDN: (TESTDESNS $L)
    (ELEMDN (0 * TESTDESN))
    ((testdesnsof V) (ST1006 $L) testdesns)
    ((strategies V) (STRATEGIES $L) strategiesof)
    ((summaryquestion V) (ST1006 $L)
        summaryquestionof)
    ((counter V) (COUNTERS $L) counterof)
    ((effects V) (EFFECTS $L) effectsof)
    ((currenttestsof V) (TOPLEVELTESTS $L)
        currenttests)
    ((componentsof V) (TESTDESNS $L) components))


(TDN: (CAUSALNETDEFN $N)
    (causedesns (CAUSEDESNS $L) causedesnsof CC71)
    (startingstates (STATEDESNS $L) startingstatesof
        CC1)
    (interiorstates (STATEDESNS $L) interiorstatesof
        CC3)
    (designatedstates (STATEDESNS $L)
        designatedstatesof CC4)
    ((statedesns !) (STATEDESNS $L) statedesnsof)
    (terminalstates (STATEDESNS $L) terminalstatesof
        CC2)
    ((commonthreshold !) (THRESHOLD T#)
        commonthresholdof))


C71-CAUSALNETDEFN-causedesns
(QSCC:
    (QUOTE
        ((CAUSEDESN C) | (STATEDESN S) (@ statedesns S)
            (S causes:elem C)))
    CAUSALNETDEFN causedesns)

C1-CAUSALNETDEFN-startingstates
(QSCC:
    (QUOTE
        ((STATEDESN S) | (@ statedesns S)
            (S causesof NIL)))
    CAUSALNETDEFN startingstates)

C3-CAUSALNETDEFN-interiorstates
(QSCC:
    (QUOTE
        ((STATEDESN S) | (@ statedesns S)
            (@ startingstates S) ¬(@ terminalstates S)))
    CAUSALNETDEFN interiorstates)

C4-CAUSALNETDEFN-designatedstates
(QSCC:
    (QUOTE
        ((STATEDESN S) | (@ statedesns S)
            (S designatedstatesof @) ->
            (@ startingstates S)))
    CAUSALNETDEFN designatedstates)

C2-CAUSALNETDEFN-terminalstates
(QSCC:
    (QUOTE
        ((STATEDESN S) | (@ statedesns S)
            (S causesof NIL)))
    CAUSALNETDEFN terminalstates)


(TDN: (CLASSDEFNS $L)
    (ELEMDN (0 * CLASSDEFN))
    ((classificationsof V) (ST1006 $L)
        classifications)
    ((classtype V) (ST1000 $L) classtypeof))


(TDN: (PEOPLE $L)
    (ELEMDN (0 * PERSON)))


(TDN: (DISEASEDESN $N)
    (date (DATE) dateof)
    ((topleveltest !D) (IT TOPLEVELTEST)
        topleveltestof)
    (causalmodel (CAUSALMODEL RN) diseasedesns)
    ((diseaseof !D) (PERSON RN) disease)
    ((diagnosis/therapy !) (COMMENTS $L)
        diagnosis/therapyof CC37)
    ((causalnet !) (IT CAUSALNET) causalnetof))

3  7 -Or  SE A SEDF3N- diagnosis/ therapy 
(vSCC: 

(QUOTE 

(  (COMMENT  C)  | 

(  (CLASS DEFN  E)  (THE  ENT  FY  DEFN  i) 

(E  f irstentry: lewerentries  F) 

(F  entrystate :iinstance: presence  CONFIRMED) 
(*’ 

(  (SOME  E  NT  BY  DEFN  G) 

(E  f irstentry :lo werentries  G) 

(F 

entry  state :  descend  errts  :e  lew:  entr ystateof 
G) 

(G  entr ystate :iinstance: presence 
CONFIRMED)  )  ) 

( (  ( E  classtype  SPECIFIC)  (ENTRYDEFN  h) 

( E  firstentry  :lowerentries  H) 

(H 

entry state : descend ents:e lea: entr ysta teof 
F) 

(ti  entry  state  :iin  stance:  presence 
CONFIRMED)  ) 

V 

( ( E  classtype  GENERAL)  (ENIRYDEFN  I) 

(E  firstentry slowerentries  I) 

(I  descendentsje lem rentry stateor : eiem  F) 

(“• 

(I  entry  state: iinstance; presence  DENIED)) 
{(SOME  PATHWAY  ?)  (ENTFYDEFh  J) 

(E  f irstentr yilowerentries  J) 

(J  lowerentries  F) 

(d  antry statejeomponentsof  P) ) 

(F  comments :is  C) ) )  ) ) ) 

DISEASEDESN diagnosis/therapy)


(TDN: (ST1006 $L)
(ELEMDNS (0 * CAUSALMODEL) (0 * QUESTION)))


(TDN: (STRATEGIES $L)
(ELEMDN (0 * STRATEGY))
((strategiesof V) (TESTDESNS $L) strategies))


(TDN: (COUNTERS $L)
(ELEMDNS (0 * COUNTER)))


(TDN: (EFFECTS $L)
(ELEMDN (0 * EFFECT))
((effectsof V) (TESTDESNS $L) effects)
((how V) (ST1005 $L) reasonfor))


(TDN: (TOPLEVELTESTS $L)
(ELEMDN (0 * TOPLEVELTEST)))


(TDN: (TESTDESN MN)
((repeatability !V) (YESNO TA) repeatabilityof
CC24)
((cost !V) (COST T#) costof CC25)
((confidence !V) (CONFIDENCE TI) confidenceof
CC26)
((firsttest !>CV) (TESTDESN MN) firstestof CC30)
((nexttest !>CV) (TESTDESN MN) previoustest CC31)
((negativenexttest !>CV) (TESTDESN MN)
negativenexttestof CC32)
((testtype !) (TESTTYPE RN) testtypeof CC20)
((testresult C>!) (YESNO TA) testresultof CC34
TR1)
((costratio $>C) (COSTRATIO T#) costratioof CC33)
((previoustest V) (TESTDESNS $L) nexttest)
((firstestof V) (TESTDESNS $L) firsttest)
((strategies V) (STRATEGIES $L) strategiesof)
((negativenexttestof V) (TESTDESNS $L)
negativenexttest)
((components !) (TESTDESNS $L) componentsof CC21)
((summaryquestion !V) (QUESTION TS)
summaryquestionof CC23)
((negativedeterminancy !) (YESNO TA)
negativedeterminancyof
CC27)
((counter !V) (COUNTER TI) counterof CC28)
((effects !V) (EFFECTS $L) effectsof CC29)
((application C>!) (IT APPLICATION)
applicationof))


CC24-TESTDESN-repeatability
(QSCC:

(QUOTE 

((YESNO  Y)  (  (a)  repeatability  Y) 

( (Y  is  YES)  V  (Y  is  NO) ) 

((.j>  testtype  WIPEOUT)  -■>  (Y  is  NO)) 

(-«  (u)  testtype  MC)  ) 

((Y  is  No)  ->  (i  i  instance:  %  :  =  s  1)))) 
TESTDESN  repeatability) 


CC25-TESTDESN-cost
(QSCC:

(QUOTE 

((COST  C)  ( 

(  (i>  cost  C)  (-«  ( testtype  WIPEOUT)) 
((«  testtype  Sq)  -> 




("• 

{(SOME  TESTDE3N  M)  { M  com  pone nts : eiem  a) 

( M  testtypo  MC) } ) ) ) 

V 

(((a)  testtype  COMBI  NAT  ICN)  V 
( a)  counter  COUNTER)  ) 

(C  is  0)  )  ) ) 

TESTDESN  cost) 

CC26-TESTDESN-confidence
(QSCC:
(QUOTE
((CONFIDENCE C) | (@ confidence C)
((@ testtype SQ) V (@ testtype COMBINATION) V
(@ testtype COUNTER) V (@ testtype GROUP))))
TESTDESN confidence)


CC30-TESTDESN-firsttest
(QSCC:
(QUOTE
((TESTDESN x) | (@ firsttest x)
(@ testtype COLLECTION)))
TESTDESN firsttest)


CC31-TESTDESN-nexttest
(QSCC:

(Q  UOTE 

( (TESTDESN  x)  I  (<u  nexttest  x) 

((0  testtype  WIPEOUT) 

((x  testtype  WIPEOUT)  V  (x  xs  NIL))) 

V  (c>  coioponentof  :  testt  y  pa  COLLECTION) ) ) 
TESTDESN nexttest)

CC32-TESTDESN-negativenexttest
(QSCC:
(QUOTE
((TESTDESN x) | (@ negativenexttest x)
(@ testtype WIPEOUT)
((x testtype WIPEOUT) V (x is NIL))))
TESTDESN negativenexttest)

CC20-TESTDESN-testtype
(QSCC:

(QUOTE 

((TESTTYPE  X)  | 

(((a)  components  NIL)  ( &  ccuntoi  NIL) 

(x  is  SQ)  ) 

((w  sunna ryquestion  NIL)  (u»  firsttest  NIL) 
(x  is  MC) ) 

((-*  (a)  summat  yy  uest.  io  n  NIL)) 

(-•  (<d  components  NIL)) 

((-«  (J  cost  NIL)  (x  is  GROUP) )  V 
(  x  is  WIPEOUT) ) ) 

(-.  (J  firsttest  NIL)  (X  is  COLLECTION)) 

((a)  counter:*:®  0)  (x  is  comb  i  naTJ  o  N) ) 




((d)  counter: #:>  0)  (x  is  COUNTER))))) 

TESTDESN  testtype) 

CC34-TESTDESN-testresult

(QSCC:

(Q  UOT  E 

((YESNO  Y)  | 

( ( i>  testresult  Y)  (Y  elemof  (YES  NO  ??  NA)) 

(j)  testtype: elemof  (SQ  GROUP  WIPEOUT)) 

(SOKE  **  X)  (d  componentsof  X) 

(((X  testtype  WIPEOUT)  ->  (X  testresult  NO) 

(Y  is  NA)  V  (Y  is  ?)) 

V 

({X  testtype  MC)  -> 

((SOME  TESTDESN  U)  (X  components:  elera  U)  -» 
(U  is  d>) 

( ( (U  testresult  ??)  *(Y  is  ??))  V 
((U  testresult  YES)  (Y  is  NO))))))))) 
TESTDESN  testresult) 

CC33-TESTDESN-costratio
(QSCC:
(QUOTE
((COSTRATIO C) |
(C is
(SMAX
((COSTRATIO D) | (** M)
(M iinstanceof:iinstanceof STATEDESN)
(@ effects:affectedstate M)
(D is
(DIVIDE (@ cost:#)
(M likelihood:probability:#))))
NIL))))
TESTDESN costratio)
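
Read informally, the CC33 constraint sets the cost ratio of a test to the largest value of cost divided by likelihood probability, taken over the states the test affects. The Python sketch below illustrates that reading only. The function name `cost_ratio`, the use of plain floats, and the decision to skip zero-probability states are assumptions made for illustration; none of them appear in the MDS definition.

```python
def cost_ratio(cost, affected_state_probabilities):
    """Largest cost/probability over the states a test affects.

    Hypothetical helper: `cost` stands for the test's cost and each entry
    of `affected_state_probabilities` for the likelihood probability of
    one affected state. Returns None when no state has a positive
    probability (an assumption, chosen to avoid dividing by zero).
    """
    ratios = [cost / p for p in affected_state_probabilities if p > 0]
    return max(ratios) if ratios else None
```

For a test of cost 2.0 affecting states with probabilities 0.5 and 0.25, the sketch returns 8.0, the ratio for the less likely state.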

CC21-TESTDESN-components
(QSCC:

(QUOTE 

((TESTDESN  M)  |  (  compo  ne  nts  :e  le  m  M) 

(■* 

((a)  testtype  SQ)  V  (d)  testtype  COMuINATICN)  V 
(d  counter  COUNTER) )  ) 

( (d  testtype  MC)  ->  (M  testtype  SQ) 

(M  cost  NIL)) 

( ( i  testtype  GROUP)  -> 

( (M  testtype  SQ)  V  (tt  testtype  MC)  )  ) 

( ( J  testtype  COLLECTION)  -> 

( (K  testtype  MC)  V  (M  testtype  GROUP)) 

(-•  ( M  nexttest  NIL))) 

((i  testtype  WIPEOUT)  -> 

( ( N  testtype  SQ)  V  (M  testtype  MC)  J 
(M  testtype  GROUP) ) ) 

((-•  (d  testtype  WIPECUT))  -> 

((TESTDESN  N)  (N  componentsielem  M) 

( C N  is  <i)  V  ( N  testtype  WiP  cOUT) ) )  ) )  ) 




TESTDESN  components) 

CC23-TESTDESN-summaryquestion
(QSCC:
(QUOTE
((QUESTION Q) | (@ summaryquestion Q)
((@ testtype SQ) V (@ testtype GROUP) V
(@ testtype WIPEOUT))))
TESTDESN summaryquestion)

CC27-TESTDESN-negativedeterminancy
(QSCC:
(QUOTE
((YESNO Y) | (@ negativedeterminancy Y)
((@ testtype SQ) V (@ testtype COMBINATION) V
(@ testtype COUNTER) V (@ testtype GROUP))
((Y is YES) V (Y is NO))))
TESTDESN negativedeterminancy)

CC28-TESTDESN-counter
(QSCC:
(QUOTE
((COUNTER C) | (@ counter C)
((@ testtype COMBINATION) V
(@ counter COUNTER))))
TESTDESN counter)

CC29-TESTDESN-effects
(QSCC:
(QUOTE
((EFFECT E) | (@ effects E)
((@ testtype SQ) V (@ testtype COMBINATION) V
(@ counter COUNTER))))
TESTDESN effects)

TR1-TESTDESN-testresult

(QSTR:

(QUOTE 

( DC  0  N  D 

(  (.;!  test  t  -".'.i  )t  l) 

(ASSERT 

(a)  testresult 
(ASKQUESTION 

(a)  suramaryguesticn)  )  ) ) ) 
(ON-CONTRADICTION 

((X)  (ti)  a  f f  ecte data  te  X) 

(■* 

((EXISTING  (X  status))  = 

(NEW  (X  status) )  )  )  ) 

(NOT 

(DCOND 

(((ADD  (EXISTING  (X  status)) 
(NEW  (X  status)  )  ) 

#  :=  0) 

(SOME  STATUS  S) 


(BIND  3 

(EXISTING  (X  status)  )  ) 

(ASSERT 

(X  conflict:  #  :  *of  S) ) 
(ASSERT  (X  status:*  0))) 

(((ABS  (NEW  (X  status)))  > 

(ABS 

(EXISTING  (X  status)))) 
(ASSERT 

(X  status 

(NEW  (X  status) )  )  ) 

(DCOND 

(  (  (ABS  (X  status:  #) )  > 

(ABS  (X  conflict:#))) 

(**  (X  conflict:#  0) )  ) 
(.ASSERT 
(X 

conflict:#  0) )))))))) 

TESTDESN  testresult) 


(TDN: (CAUSEDESNS $L)
(ELEMDN (0 * CAUSEDESN))
((causedesnsof V) (ST1011 $L) causedesns)
((causesof V) (STATEDESNS $L) causes))


(TDN: (STATEDESNS $L)
(ELEMDN (0 * STATEDESN))
((startingstatesof V) (CAUSALNETDEFNS $L)
startingstates)
((statedesnsof V) (CAUSALNETDEFNS $L) statedesns)
((interiorstatesof V) (CAUSALNETDEFNS $L)
interiorstates)
((designatedstatesof V) (CAUSALNETDEFNS $L)
designatedstates)
((terminalstatesof V) (CAUSALNETDEFNS $L)
terminalstates)
((descendentsof V) (ST1012 $L) descendants))


(TDN: (THRESHOLD T#)
((thresholdof V) (STATEDESNS $L) threshold)
((commonthresholdof V) (CAUSALNETDEFNS $L)
commonthreshold))


(TDN: (ST1000 $L)
(ELEMDNS (0 * ENTRYDEFN) (0 * CLASSNAME)
(0 * LIKELYHOOD)))


(TDN: (CLASSDEFN $N)
((firstentry !) (ENTRYDEFN $N) firstentryof)
((classtype !) (CLASSNAME RN) classtypeof CC15))

CC15-CLASSDEFN-classtype
(QSCC:
(QUOTE
((CLASSNAME C) | (@ classtype C)
(C elemof (SPECIFIC GENERAL))))
CLASSDEFN classtype)


(TDN: (APPLICATION $N)
((candidatestates !V) ** candidatestatesof CC50)
((candidatetests !V) ** candidatetestsof CC51)
((nextchoice !) (IT (? CC52)) nextchoiceof)
((nextapplication !) (IT (? CC65))
nextapplicationof))

CC50-APPLICATION-candidatestates

(QSCC:

(Q  COTE 

((**  S)  | 

((a)  testtype  GROUP) 

(S  iinstanceor:  iinsta ncecf  S1ATEDESN) 

(THE  STRATEGY  X) 

(3)  applicationof : elemof : strategies:  eiem  X) 
(X  influences  STATES) 

(((X  is  GLOBAL)  (-.  (S  presence  DENIED)))  V 
(((X  is  LIKELYHYPOTHESIS)  V 
(X  is  POTENTIALH YPCTHESIS) ) 

(SOME  **  C) 

(0  ii nstanceof: iinstanceor  STATSDESN) 

(O  presence  CONFIRMED)  (S  descendants  C) 

('• 

((SOME  P) 

(P  iinstanceof :iinstanceo£  STATEDESN) 

(P  presence  DENIED)  (S  descendants  P) 

(P  descendants  0) ) ) 

((X  is  LIKELYHYPOTHESIS)  -> 

(S 


statesof : mlstartinqstates: result : descendents:  elem  S) ) ) ) ) ) ) 
APPLICATION  candidatestates) 

CC51-APPLICATION-candidatetests
(QSCC:

(QUOTE

((♦*  X)  I 

((a>  tor.tt  ypu  GROUP) 

(x  iinsl  anccoi :  ii  nst  xnceof  ll.STDL.SH) 

(u>  components  x)  (x  test  result  ?)  (SOME  B) 

(U  ii  nstanceot :  ii  list  *  ncoui  SI  AT  LDESN) 

(a>  Candida t estate  ')) 

(<A  e t  feet. :  a  ttectedstu  te  B) 

(x  confidence; < :>:*of  (APS  (ti  status)  ,)) 




APPLICATION  candidatetests) 
CC52-APPLICATION-nextchoice-argdn<1>IT

(QSCC:

(QUOTE

( (TESTDESN  M)  j 

(((J»  testtype  GROUP)  (THE  STRATEGY  S) 

(fi  strategies:  elem  S)  (S  influences  TESTS) 

(THE  **  N)  (a!  candidatetests  N) 

(  (  (S  is  MINCOST)  (M  elemof  (SHIN  N  cost)))  v 
(  (S  is  MA  XW  EIGHT  MI  NCCST) 

(M  af fectedstate 
(SMAX 

(<**?>  I 

(X  iinstancecf :iinstanceof 
STATBDESN) 

(3  candidatest'ate  X) 

(X  aff  ectedstateoi  (SMIN  N  cost))) 
lik  elinood : probability) )) 

V 

( (S  is  MAXW EIGHTCCSTRATIO) 

(M  elemof  (SMAX  N  ccstratio))) 

V 

(  (S  is  MA  XWEIGHT) 

(M  effects :affectedstate 
(SMAX  (a  candidatestate) 

likelihood  jprofcability) ))) )  V 

((a  testtype  COLLECTION) 

(M  iinstanceof :iinstanceoi  TESTDESN) 

(M  testresult  ?)  (d)  components  fl) 

( (  ( (SOME  **  x) 

(x  iinstanceof liinstanceof  TESTDESN) 

(a  components  x)  (x  nexttest  M) 

(-*  (x  testresult  ?) )  ) 

V  (a  firsttest  M) ) 

V  (M  is  NIL) )  ) 

V 

((a  testtype  MC) 

(M  iinstanceot :iinstanceoi  TESTDESN) 

(a  components  M)  ( M  testresult  i) ) 

V 

(((a  testtype  SQ)  V  (a  testtype  COMBINATION) 

V  (a  testtype  COUNTER)) 

(M  is  NIL))))) 

APPLICATION nextchoice argdn 1)

CC65-APPLICATION-nextapplication-argdn<1>IT
(QSCC:
(QUOTE
((** TT) |
((@ candidatetests:#:< 2) (TT is NIL)) V
(TT is APPLICATION)))
APPLICATION nextapplication argdn 1)




(TDN: (PERSON RN)
((disease V) (DISEASEDESNS $L) diseaseof)
(age (ASKYEARS) ageof) (sex (ASKSEX) sexof))


(TUN:  (TCI  LE7LLT25T  IN) 

(  (d-cce:.  lent  Xt)  (TOFIEVCLIEST  oN)  CCH] 

( (aiu:  2X)  (TO  RLE  VC  ITS  ST  V: )  a-:;iCda-«:  l  CCHii) 

(  (ouiranttests  ! )  ( .  EJTDE-i  '*  >  -  C)  cuueutto.it  scf 

CC  *  ■) 

((.;*•  i»jr  •  it  !)  (I*5’  (?  CC1C3)  )  select  eat.-' st.ro  f ) 

((!>:xv.-  ;t.  !)  (IT  (?  CCo)  )  p re vioustest) ) 

CC101-TOPLEVELTEST-descendent
(QSCC: (QUOTE ((TOPLEVELTEST X) | (@ nexttest X)))
TOPLEVELTEST descendent)

CC102-TOPLEVELTEST-ancestor
(QSCC:
(QUOTE
((TOPLEVELTEST X) |
((@ topleveltestof NIL) -> (X is NIL))))
TOPLEVELTEST ancestor)

29 -U  PLEV EL TEST- cur rent tests 
SCC: 

(QUOTE 

(  (TESTDES  N  M)  | 

((TOPLEVELTEST  w) 

( (a)  ancestor  W)  ->  -»  (W  relectedtests  M )  )  ) 
((THE  CAUSALMODEL  X)  (THE  TES1DESNS  N) 

(THE  CAUSALNET  CN) 

( (SOME  TCPIEVSLTEST  Y) 

((0  ancestor  Y)  V  ( a)  is  Y)  ) 

(Y  t op  Is veltes tof :c ausalnet  CC) 

(C  •»  C.i  IK-i.  1  po  ae  1  X)) 

(I-,  tisstuesnsot  X)  (X  tostdesns  ») 

(((a)  ancestor  NIL)  (M  testtypu  WIPEOUT) 

(H  previoustest  NIL) 

( M  neqati vene xttestof  NIL)) 

V 

( {-«  (5  ancestor  NIL)) 

(((THE  **  Q) 

( o>  ptevioustost:currenttests:el  em  2) 

(Q  testtype  WIPr.  UT, 

(  (Q  test  result  YES)  (O  iiexttest  M)  V 
(0  test result  NO) 

(Q  neqativenexttest  il)  )  ) 

V 

((THE  STRATEGY  S)  (N  strategies  S) 

(S  intluonces  TES1E) 

(  (  (S  is  N  INCOST) 

(  !1  c  loroot 

(SHIN  (  (T'ESTDL  SN  NN)  |  ( NI.  elewof  N)  ) 

cost )  )  ) 




V 

((S  is  MAX  WEIGHT MIN COST) 

(M  af fectedstate 
(SMAX 

((**  Z)  | 

(Z  iinsta nceof ; iinstanceof 
STATSDESN) 

(CN  states  Z) 

(  (SMIN 

(  (TE  STDESN  NM)  | 

{NM  elemof  N)  ) 
cost) 

af fectedstate  Z) ) 
likelihood  : probability) )  ) 

V 

((S  is  MAX  WEIGH  T COS TP  AT 10) 

(M  elemof 

(SKAX  (  (TESTDES  N  NQ)  |  (NQ  elemof  N)  ) 
cost  ratio)  )  ) 

V 

((S  is  MAX  WEIGH  T) 

(M  af fectedstate 
(SMAX 

(  (ST ATEDESN  ST)  | 

(ST  elemof : sta tesoi  CN)  ) 
likelihood tprobabiiity) )))))))))) 
TOPLEVELTEST currenttests)

CC103-TOPLEVELTEST-selectedtests-argdn<1>IT
(QSCC:
(QUOTE
((** X) |
((@ currenttests:# 1) (@ currenttests:elem X)
V (@ selectedtests X)
(@ currenttests:elem X))))
TOPLEVELTEST selectedtests argdn 1)

5-  UP  LE VELTSST-nexttest-argdnk 1> IT 
(QSCC:
(QUOTE
((** TT) | ((@ currenttests NIL) (TT is NIL)) V
(TT is TOPLEVELTEST)))
TOPLEVELTEST nexttest argdn 1)


(TDN: (COMMENTS $L)
(ELEMDN (0 * COMMENT))
((diagnosis/therapyof V) (DISEASEDESNS $L)
diagnosis/therapy)
((commentsof V) (ENTRYDEFNS $L) comments))


(TDN: (CAUSALNET $N)
(causalnetof (DISEASEDESN $N) causalnet)
(startingstates ** startingstatesof CC42)




(pathways (IT PATHWAY) pathwaysof)
((causes !) (IT (? CC41)) causesof)
(terminalstates ** terminalstatesof CC43)
(interiorstates ** interiorstatesof CC44)

(mlstartingstates (MLSTARTINGSTATES (? CC4b))
mlstartingstatesof)

((states !) (IT (? CC40)) statesof))


CC42-CAUSALNET-startingstates
(QSCC:

(Q  UOT  E 

((**  S)  | 

((3  iinsta nceot :  star ti  nqstat esoi : ole  m  u,)  V 
((SOME  x) 

(x  iinstanceof  :iinst  a  nceot  STATiiDESN) 

(d  star tingstate  x)  (x  causes:  state  S) 

(x  status  CENIF.D)  (-•  (3  status  DENIED)))))) 

CAUSALNET startingstates)


CC41-CAUSALNET-causes-argdn<1>IT
(QSCC:
(QUOTE
((TEMPLATES X) |
(@
causalnetof:causalmodel:causalnetdefn:causedesns X)))
CAUSALNET causes argdn 1)


CC43-CAUSALNET-terminalstates
(QSCC:
(QUOTE
((** S) |
(@
causalnetof:causalmodel:causalnetdefn:terminalstates:iinstance S)))
CAUSALNET terminalstates)

CC44-CAUSALNET-interiorstates
(QSCC:
(QUOTE
((** S) |
(@
causalnetof:causalmodel:causalnetdefn:interiorstates:iinstance S)))
CAUSALNET interiorstates)

:-t  j-CaUSALN  ET-mlstart.  inqstates- a  rgdn<1>ML  ST  Ah  TlMGSTATi.S 
(QSCC: (QUOTE ((CAUSALNET C) | ((@ is C)))) CAUSALNET
mlstartingstates argdn 1)




CC40-CAUSALNET-states-argdn<1>IT
(QSCC:
(QUOTE
((** X) |
(@
causalnetof:causalmodel:causalnetdefn:statedesns X)))
CAUSALNET states argdn 1)


(TDN: (QUESTION TS)
((summaryquestionof V) (TESTDESNS $L)
summaryquestion))


(TDN: (STRATEGY RN)
(influences (INFLUENCE RN) influencesof)
((influencedbyof V) (ST1003 $L) influencedby))


(TDN: (COUNTER TI)
((counterof V) (TESTDESNS $L) counter))


(TDN: (ST1005 $L)
(ELEMDNS (0 * ENTRYDEFN) (0 * STATEDESN)
(0 * HOW)))


(TDN: (EFFECT $N)
((how !) (HOW TA) reasonfor CC36)
((affectedstate !) (STATEDESN MN)
affectedstateof))


CC36-EFFECT-how
(QSCC:
(QUOTE
((HOW H) | (@ how H)
((H is CONFIRMED) V (H is DENIED))))
EFFECT how)


(TDN: (YESNO TA)
((repeatabilityof V) (TESTDESNS $L) repeatability)
((testresultof V) (TESTDESNS $L) testresult)
((negativedeterminancyof V) (TESTDESNS $L)
negativedeterminancy))


(TDN: (COST T#)
((costof V) (TESTDESNS $L) cost))


(TDN: (CONFIDENCE TI)
((confidenceof V) (TESTDESNS $L) confidence))


(TDN: (TESTTYPE RN)
((testtypeof V) (TESTDESNS $L) testtype))


(TDN: (COSTRATIO T#)
((costratioof V) (TESTDESNS $L) costratio))


(TDN: (ST1011 $L)
(ELEMDNS (0 * CAUSALNETDEFN) (0 * PATHWAY)))


(TDN: (CAUSEDESN MN)
((state !) (STATEDESN MN) stateof)
((transitionprob !) (PROB T#) transitionprobof
CC19)
((causesof V) (STATEDESNS $L) causes))

CC19-CAUSEDESN-transitionprob
(QSCC:

(QUOTE 

((PSOE  P)  | 

(-»  (3  3tate:terminalstatesoi:causedo3ns  <t) 
(P  #: 0)  (P  #:=<  1)  ) 

V  (P  #:=  0) )  ) 

CAUSEDESN transitionprob)


(TDN:  (CAUSALNETDEFNS  $L) 

(ELEMDN  (0  *  CAUSALNETDEFN))) 


(TDN: (ST1012 $L)
(ELEMDNS (0 * STATEDESN) (0 * ENTRYDEFN)))


(TDN: (STATEDESN MN)
((conflict C) (CONFLICT TI) conflictof)
((presence C) (PRESENCE TA) presenceof CC13 TR2)
((likelihood C>!) (IT LIKELYHOOD) likelihoodof)
((affectedstateof V) (EFFECTS $L) affectedstate)
((entrystateof V) (ST1002 $L) entrystate)
((stateof V) (CAUSEDESNS $L) state)
((startingweight !) (PROB T#) startingweightof
CC7)
((descendants X) (STATEDESNS $L) descendentsof
CC8)
((causes !) (CAUSEDESNS $L) causesof CC10)
((threshold !) (THRESHOLD T#) thresholdof CC67)
((status C) (STATUS TI) statusof CC11))

CC13-STATEDESN-presence
(QSCC:
(QUOTE
((PRESENCE P) |
((@ status:#:>=:#of:thresholdof @) ->
(P is CONFIRMED))
((@ status:#:=< (MINUS (@ threshold:#))) ->
(P is DENIED))))
STATEDESN presence)
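
The CC13 constraint can be read as a two-sided threshold test on a state's status: at or above the threshold the state is CONFIRMED, at or below the negated threshold it is DENIED. The Python sketch below shows that reading. The function name `presence`, the string return values, and the `None` result for the undetermined band are illustrative assumptions; the listing itself does not say what value presence takes between the two bounds.

```python
def presence(status, threshold):
    """Classify a state from its numeric status and its threshold.

    Hypothetical helper mirroring the two implications in CC13:
      status >= threshold   ->  "CONFIRMED"
      status <= -threshold  ->  "DENIED"
    Anything strictly between the bounds is left undetermined (None),
    which is an assumption of this sketch.
    """
    if status >= threshold:
        return "CONFIRMED"
    if status <= -threshold:
        return "DENIED"
    return None
```

With a threshold of 0.5, a status of 0.8 is confirmed, a status of -0.6 is denied, and a status of 0.1 stays undetermined.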

CC7-STATEDESN-startingweight
(QSCC:
(QUOTE
((PROB P) |
((@ startingweight P)
(((@ startingstatesof:statedesns @) V
(@ designatedstatesof:statedesns @))
(P #:>= 0) (P #:=< 1))
V (P #:= 0))))
STATEDESN startingweight)

CC8-STATEDESN-descendents
(QSCC: (QUOTE ((STATEDESN S) | (@ causes:state S)))
STATEDESN descendents)

CC10-STATEDESN-causes
(QSCC:
(QUOTE
((CAUSEDESN C) | (@ causes C)
(C causedesnsof:statedesns @)))
STATEDESN causes)

CC67-STATEDESN-threshold
(QSCC:
(QUOTE
((THRESHOLD T) |
((@ threshold T) V
(@ statedesnsof:commonthreshold T))))
STATEDESN threshold)


CC11-STATEDESN-status
(QSCC:

(QUOTE 

((STATUS  S)  |  (SOKE  K) 

(  (SMAX 

((**  N)  | 

(N  ii nstanceof :i i nstanceor  T^STDLSN) 

(3  afiectedstatecf lelemciietioctsof  N) 
(N  testtype  SQ) 

( ( N  testresult  YES)  V 
( ( N  testresult  NC) 

(N  negativedetermina ncy  YES)))) 
cost) 
elom  M) 

(((M  negativedeterminancy  YES) 

((MINUS  (M  cost:#))  #of  S)) 




V  (S  #:  = : #of : costof  M)))) 

STATEDESN  status) 

TR2-STATEDESN-presence
(QSTR:

(0  UOTE 

(NBT 

( DCOND 
(“• 

((EXISTING  (a)  presence))  is 
(NEW  (a)  presence) )  ) 

(ASSERT 

(a)  presence 

(NEW  (ai  presence)))))) 
((**  s)  (a)  causes:state  S) 

(ASSERT  (S  forwaraweight)  ) ) 

(DCOND 

((NEW  (d>  presence  CONFIRMED)) 

( (**  Q)  (3  causedby: state  Q) 
(ASSERT  (Q  in verseweight) ) ) ) 
(T 

{(**  R)  (a)  causes ‘.state  R) 
(ASSERT  (R  total  weight)  ) ) 
(ASSERT  (d>  inverseweight) ) ) )  ) ) 
STATEDESN  presence) 


(TDN: (ENTRYDEFN $N)
((descendents $) (STATEDESNS $L) descendentsof
CC17)
((comments !) (COMMENTS $L) commentsof)
((nextentry !) (ENTRYDEFN $N) nextentryof)
((firstentryof V) (CLASSDEFNS $L) firstentry)
((nextentryof V) (ST1000 $L) nextentry)
((lowerentries $XR) (ENTRYDEFN $N) lowerentriesof
CC18)
((lowerentriesof V) (ENTRYDEFNS $L) lowerentries)
((entrystate !) (STATEDESN MN) entrystateof CC70))

CC17-ENTRYDEFN-descendents
(QSCC:
(QUOTE
((STATEDESN S) |
(@ lowerentries:entrystate S)))
ENTRYDEFN descendents)

CC18-ENTRYDEFN-lowerentries
(QSCC: (QUOTE ((ENTRYDEFN L) | (@ nextentry L)))
ENTRYDEFN lowerentries)

CC70-ENTRYDEFN-entrystate
(QSCC:
(QUOTE
((STATEDESN S) |
((THE STATEDESN R)
((@ nextentryof:entrystate R) ->
(R descendents:elem S)))))
ENTRYDEFN entrystate)


(TDN: (CLASSNAME RN)
((classtypeof V) (CLASSDEFNS $L) classtype))


(TDN: (LIKELYHOOD $N)
(probability (PROB T#) probabilityof CC53)
(forwardweight (PROB T#) forwardweightof CC54 TR3)
(totalweight (PROB T#) totalweightof CC55 TR4)
((inverseweight D) (IT CONDPROB) inverseweightof
CC91 TR5)
(totalinverseweight (PROBS $L)
totalinverseweightof CC57 TR6))

CC53-LIKELYHOOD-probability
(QSCC:
(QUOTE
((PROB P) |
(P #:is
(MIN 1
(MAX (@ forwardweight:#)
(@ totalinverseweight:#))))))
LIKELYHOOD probability)
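
The CC53 constraint combines the two weights of a LIKELYHOOD instance by taking the larger of the forward weight and the total inverse weight and capping the result at 1. A minimal Python sketch of that arithmetic follows; the function and parameter names are illustrative, and treating both weights as single floats is an assumption of the sketch.

```python
def likelihood_probability(forward_weight, total_inverse_weight):
    """Probability as MIN(1, MAX(forward weight, total inverse weight)).

    Hypothetical helper: both arguments are plain numbers standing in for
    the `forwardweight:#` and `totalinverseweight:#` values in CC53.
    """
    return min(1.0, max(forward_weight, total_inverse_weight))
```

The cap matters when either weight exceeds 1: forward weight 1.3 with inverse weight 0.2 yields a probability of 1.0 rather than 1.3.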

CC54-LIKELYHOOD-forwardweight
(QSCC:
(QUOTE
((PROB P) | (THE ** X)
(X iinstanceof:iinstanceof STATEDESN)
(X likelihood @)
(P =
(ADD (X startingweight:#)
(SUM
((PROB Q) | (THE ** Y)
(Y iinstanceof:iinstanceof CAUSEDESN)
(Y state:causes:state X)
(Y state:presence ?)
(Q is
(PRODUCT
(Y
state:likelihood:forwardweight:#)
(Y transitionprob:#)))))
(SUM
((PROB Q) | (THE ** Y)
(Y iinstanceof:iinstanceof CAUSEDESN)
(Y state:causes:state X)
(Y state:presence CONFIRMED)
(Q transitionprobof Y)))))))
LIKELYHOOD forwardweight)

CC55-LIKELYHOOD-totalweight
(QSCC:

(0  HOT  E 

((PROB  P)  |  (THE  **  X) 

(X  iinstanceof: iinsta nceof  STATEDESN) 

(X  likelihood  <2) 

(P  = 

(ADD  (X  startin qweight : #) 

(SUM 

(  (PROii  Q)  |  (THE  **  Y) 

( Y  iinstanceof riinstanceor  CAUSEDESN) 

(Y  state: causesrstate  X) 

(-•  (Y  stat e: presence  DENIED)) 

(Q  is 
(PP0DUC1 
(* 

state:  like  li  hood ;  forwaruweight:  #) 
(Y  transitionproL: #)))))))) ) 
LIKELYHOOD totalweight)

CC91-LIKELYHOOD-inverseweight
(QSCC:
(QUOTE
((CONDPROB C) |
((@ likelihoodof:presence DENIED) ->
(C is NIL))))
LIKELYHOOD inverseweight)

CC57-LIKELYHOOD-totalinverseweight
(QSCC:
(QUOTE
((PROB P) |
(P is
(SMAX
((PROB P) | (CONDPROB C)
(C causestate:likelihood @)
(C probability P))
NIL))))
LIKELYHOOD totalinverseweight)

TR3-LIKELYHOOD-forwardweight

(QSTR:

(QUOTE 

(NOT 

((**  S)  (S  causesrstate  S)  (S  presence  ?) 

(ASSERT  (S  forwardweiyht)  ) ) 

(DC  ONI) 

(  (-»  (u)  presence  DENIED)) 

(  (**  0)  ( ti)  causesrstate  y; 

(ASSERT  (Q  tot.ilwexy  lit )  )  )  )  ) 

(FORCE  ( a)  probability)))) 

LIKELYHOOD forwardweight)

TR4-LIKELYHOOD-totalweight
(QSTR:

(QUOTE 




(NBT  (ASSERT  ( a)  in verseweiqht )  ) 

(DCOND 

( ('i  presence  CONFIRMED) 

((**  S)  (a>  cause  dby  :st  at  e  S) 

(ASSERT  (S  inverseweiyht)  ) ) ) )  )  ) 
LIKELYHOOD totalweight)

TR5-LIKELYHOOD-inverseweight
(QSTR:

(QUOTE 

(NBT 

(  (**  S)  (a>  causedby :  state  S) 

(ASSERT  (S  in  versewe  ight)  ) ) 

(ASSERT  (ai  totalin verseweiyht)  ) ) ) 
LIKELYHOOD inverseweight)

TR6-LIKELYHOOD-totalinverseweight

{^STR:  (QUOTE  (NET  (FORCE  (a)  probability))))  LIKELYHOOD 
total in verse we ight) 


(TDN: (ENTRYDEFNS $L)
(ELEMDN (0 * ENTRYDEFN)))


(TDN: (COMMENT $N)
((diagnosis !) (STATEMENT TS) diagnosisof)
((therapy !) (STATEMENT TS) therapyof))


(TDN: (PATHWAY $N)
(components ** componentsof CC49)
(startingstate ** startingstateof CC48)
(nextpathway (IT (? CC66)) nextpathwayof))

CC49-PATHWAY-components
(QSCC:

(QUOTE 

(  (**  S)  |  (S  iinstanceof siinstanceof  STATEDESN) 
(«3  startingsta te:desce nde nt  S) 

((SOME  **  x) 

(x  iinstanceof:  ii  nst  anceor  STATEDESN) 

(3  startingstate:descendent  x) 

(x  presence  DENIED)  (x  descendant  S)  )  ) 

((**  U)  (U  iinstanceof : iinstanceof  STATEDESN) 
(i  startingstat eidescendent  U) 

(U  descendent  S)  {'J  presence  CONFIRMED)  )  )  ) 
PATHWAY  components) 

CC48-PATHWAY-startingstate
(QSCC:
(QUOTE
((** S) | (S iinstanceof:iinstanceof STATEDESN)
(@ pathwaysof:mlstartingstates S)
((PATHWAY P) (P startingstate S) ->
(@ is P))))
PATHWAY startingstate)

CC66-PATHWAY-nextpathway-argdn<1>IT
(QSCC:
(QUOTE
((TEMPLATE P) |
((((** S) (@ pathwaysof:mlstartingstates S)
((SOME PATHWAY Q) (Q startingstate S)))
(P is NIL))
V (P is PATHWAY))))
PATHWAY nextpathway argdn 1)


(TDN: (INFLUENCE RN)
(influencedby (STRATEGY RN) influencedbyof)
((influencesof V) (STRATEGIES $L) influences))


(TDN: (ST1003 $L)
(ELEMDNS (0 * INFLUENCE) (0 * LIKELYHOOD)))


(TDN: (HOW TA)
((reasonfor V) (EFFECTS $L) how))


(TDN: (PROB T#)
((totalweightof V) (ST1000 $L) totalweight)
((forwardweightof V) (ST1000 $L) forwardweight)
((transitionprobof V) (CAUSEDESNS $L)
transitionprob)
((startingweightof V) (STATEDESNS $L)
startingweight)
((probabilityof V) (ST1001 $L)
probability))


(TDN: (CONFLICT TI)
((conflictof V) (STATEDESNS $L)
conflict))


(TDN: (PRESENCE TA)
((presenceof V) (STATEDESNS $L)
presence))


(TDN: (ST1002 $L)
(ELEMDNS (0 * LIKELYHOOD) (0 * ENTRYDEFN)))


(TDN: (STATUS TI)
((statusof V) (STATEDESNS $L) status))


(TDN: (CONDPROB $N)




(causestate ** causestateof CC93)
(effectstate ** effectstateof CC90)
(nextprob (IT CONDPROB) nextprobof CC92)
(probability (PROB T#) probabilityof CC56))

CC93-CONDPROB-causestate

(QSCC:

(Q  DOTE 

((**  S)  | 

(((-•  (3  in verseweightof  NIL)) 

(3  in  verse  we ightof : likelihoouof  S)  ) 

V  (3  next piobot:ca usestate  S) ) )  ) 

CONDPROB causestate)

CC90-CONDPROB-effectstate

(QSCC:

(Q  DOT  E 

( (**  S)  |  (3  causestate:  descenuents: e  lem  S) 

(S  presence  CONFIRMED) 

(- 

((SOME  **  Q) 

(3  causestate:  descendants:  eleir  Q) 

(Q  descendents  S)  (Q  presence  DENIED))) 

(  (CONDPROB  C) 

((C  causestate: . dusestateof  a) 

(C  effectstate  S) ) 

->  (C  is  3)))) 

CONDPROB effectstate)

CC92-CONDPROB-nextprob
(QSCC:

(QUOTE 

((CONDPROB  C?)  I  (3  nextprob  CP)  (BOM  E  **  S) 
(3  causestat e: descendents:elem  S) 

(S  presence  CONFIRMED) 

(*’ 

((SOME  **  Q) 

(a)  causestate: descendents: eleui  Q) 

(Q  doscendents  S)  (Q  presence  DENIED))) 
(CONDPROB  C)  ( u>  causestat e: causestat eof  C) 
(-*  (C  effectstate  S)  ) )  ) 

CONDPROB  nextprob) 

CC56-CONDPROB-probability

(QSCC:

(QUOTE 

( (  P  R  0  B  P)  |  (  Til h  **  X) 

(X  iinst  ir.ceof :  iinstancecf  JiATEDESN) 

(3  effectstate  X) 

(P  = 

(DIVIDE 

(PRODUCT  (3  causestato:totalweiyht: #) 
(SUM 

((PROD  Y)  |  (**  2) 

(Z  iinstanceot  :ixnstanceof: 




CAU5EDESN) 

(a)  causestat e : causes  Z) 

(3 

ca  us  estate  tdesct  naeritsor  :  stateof 
Z) 

(((3  causestate:stateor  Z) 

{Y  transiticnprobor  Z) ) 

V 

({THE  CCNDPFOB  C) 

(C  causestate:  stateor  Z) 

(C  effectotate  X) 

(Y  is 

(PPODUCT  (Z  tr ansition piob: #) 
(C  probability: *))))) ) ) ) 

(X  totalveignt: * ) ) ) ) ) 

CONDPROB probability)


(TDN: (PROBS $L)
(ELEMDN (0 * PROB))
((totalinverseweightof V) (LIKELYHOODS $L)
totalinverseweight))

(TDN: (ST1009 $L)
(ELEMDNS (0 * CAUSALNET) (0 * APPLICATION)))

(TDN: (APPLICATIONS $L)
(ELEMDN (0 * APPLICATION)))

(TDN: (CONDPROBS $L)
(ELEMDN (0 * CONDPROB)))

(TDN: (STATEMENT TS))

(TDN: (ST1001 $L)
(ELEMDNS (0 * CONDPROB) (0 * LIKELYHOOD)))

(TDN: (LIKELYHOODS $L)
(ELEMDN (0 * LIKELYHOOD)))

(TDN: (CLASSNAMES $L)
(ELEMDN (0 * CLASSNAME)))

(TDN: (PATHWAYS $L)
(ELEMDN (0 * PATHWAY)))

(TDN: (CAUSALMODELS $L)
(ELEMDN (0 * CAUSALMODEL)))

(TDN: (CAUSALNETS $L)
(ELEMDN (0 * CAUSALNET)))

(TDN: (MINUS $F)
(FNDEF TUPLE ((FLOATP FIXP) NIL)
((FLOATP FIXP) NIL)))

(TDN: (ASKQUESTION $F)
(FNDEF TUPLE ((QUESTION) NIL) ((YESNO) NIL)))


(TDN: (SMIN $F)
(FNDEF TUPLE ((LISTP) NIL) ((RELPATH) CC56)
((**) CC161)))

C5d- JM IN  <2  > 

(QSCC:
(QUOTE
((RELPATH R) | (@ arg2 R) (X) (@ arg1:elem X)
(DIMNCK X R)))
SMIN argdn 2)

CC161-SMIN<3>
(QSCC: (QUOTE ((** A) | (A elem:elemof:arg1of @))) SMIN
argdn 3)


(TDN: (SMAX $F)
(FNDEF TUPLE ((LISTP) NIL) ((RELPATH) CC59)
((**) CC160)))

CC59-SMAX<2>
(QSCC:
(QUOTE
((RELPATH R) | (@ arg2 R) (X) (@ arg1:elem X)
(DIMNCK X R)))
SMAX argdn 2)

CC160-SMAX<3>
(QSCC: (QUOTE ((** A) | (A elem:elemof:arg1of @))) SMAX
argdn 3)


(TDN: (SUM $F)
(FNDEF TUPLE ((LISTP) NIL) ((**) CC60)))

CC60-SUM<2>
(QSCC:
(QUOTE
((** D) | (@ arg2 D) (THE TI x)
((X) (@ arg1:elem X) (X iinstanceof x))
(D iinstanceof x)))
SUM argdn 2)


(IDN:  (A3KSEX  $F) 

(FNDEF  TUPLE  ((GENDER)  NIL))  (sexol  (PEOPLE  3L)  sex)) 


(TEN:  (ASKYEARS  $F) 

(FNDEF  TUPLE  ((YEARS)  NIL)) 
(aqeof  (PEOPLE  $L)  age)) 


(IDN:  (DATE  $F) 

(FNDEF  TUPLE  {(SIRINGP)  NIL)) 
((dateof  V)  (DISEASEDESNS  $L>  date)) 


(IDN:  (ML  STAR  TIN  GST  A  TES  $F) 

(FNDEF  TUPLE  (  (CAUSALNET)  NIL)  (  (**)  CC47)) 
( (nlsta rtinystatesof  V)  (ST1CC9  $L) 

mlstartingstates) ) 


Ji4/“JiijSTARTINGSTATES<2> 

(Q  see: 

(QUOTE 

((**  S)  | 

(S  iinstanceot: iinsta • cec£  STAIEDESN ) )  ) 
MLSTARTINGSTATES  arqdn  2) 


(TDN:  (ADD  $F) 

(FNDEF TUPLE ((FIXP FLOATP) NIL)

((FIXP FLOATP) NIL) ((FIXP FLOATP) NIL)
((FIXP FLOATP) NIL)))


(TDN :  (PRODUCT  $F) 

(FNDEF  TUPLE  ((FIXP  FLOATP)  NIL) 

((FIXP FLOATP) NIL) ((FIXP FLOATP) NIL)))


(IDN:  (DIVIDE  $F) 

(FNDEF  TUPLE  ((FLOATP  FIXP)  NIL) 

((FLOATP FIXP) NIL) ((FLOATP FIXP) NIL)))


( i  DN  :  (ADS  fF) 

(FNDEF TUPLE ((FLOATP) NIL) ((FLOATP) NIL)))


(IDN:  (MIN  $  F) 

(FNDEF  TUPLE  ((FIXP  FLOATP)  NIL) 

((FIXP FLOATP) NIL) ((FIXP FLOATP) NIL)))




(TDN : 


(  TDN: 
(TDK: 


(MAX  $F) 

(FNDEF TUPLE ((FIXP FLOATP) NIL)

(  (FIXP  FLOATP)  NIL)  ((FIXP  FLOATP)  NIL))) 


( YE  APS  TI) ) 
(GENDER  R  N)  ) 


SOSAP-TR-30 


February  1977 


HEREDITARY-LOCK RESOLUTION: A RESOLUTION REFINEMENT COMBINING A
STRONG  MODEL  STRATEGY  WITH  LOCK  RESOLUTION 

D.  M.  Sandford 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This  research  was  supported  by  the  Advanced  Research  Projects  Agency 
of  the  Department  of  Defense  under  Grant  #DAHC15-73-G6  to  the 
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


CONTENTS 


Acknowledgments

Chapter 1

    1.0  Orientation
    1.1  Conventions and Abbreviations
    1.2  Lock Resolution
    1.3  The Model Strategy and Semantic Resolution
    1.4  The Intersection of Lock Resolution and The Model Strategy
    1.5  A Completeness Proof of The Model Strategy Using Lock Resolution

Chapter 2

    2.0  Models and Resolution Searches
    2.1  Models for Use with Semantic Strategies: A Specific Example
    2.2  The Connection Between A-Models and Herbrand Interpretations
    2.3  A Defect in The Model Strategy

Chapter 3

    3.0  Hereditary-Lock Resolution
    3.1  HL-Resolution: An Informal Description
    3.2  An HL-Resolution Example
    3.3  Some Comments Concerning HL-Resolution Searches and
         Other Resolution Searches
    3.4  Definition of the Basic HL-Resolution Refinement Strategy
    3.5  Soundness and Completeness of HL-Resolution
    3.6  Evaluation of the HL-Resolution Strategy
    3.7  Extensions of the HL-Resolution Strategy

Chapter 4

    4.0  Summary

REFERENCES

APPENDIX

INDEX

TABLE OF ABBREVIATIONS




ACKNOWLEDGMENTS 

I  would  like  to  extend  sincere  appreciation  and  thanks  to 
Professor  C.  V.  Srinivasan,  who  provided  the  initial  impetus  and 
continued  encouragement  for  the  writing  of  this  report,  and  to 
Professor  A.  Yasuhara,  who  has  read  and  offered  much  needed 
suggestions  for  improvement  in  the  presentation  of  this  material. 




1.0 Orientation

This  report  assumes  the  reader  is  familiar  with  the  techniques 
of  resolution  theorem  proving  (Robinson,  1965)  for  the  first  order 
predicate  calculus.  This  background  information  is  most  easily 
obtained  by  reading  Chapters  6,  7  and  8  of  Nilsson,  and  Chapters  1 
through  6  of  Chang  and  Lee,  (Nilsson,  1971)  (Chang  and  Lee,  1973). 
The  terminology  used  in  this  report  is  consistent  with  these  two 
references  except  where  explicitly  defined  differently. 

Chapter  1  of  this  report  sets  the  context  in  which  to 
understand  the  results  and  viewpoints  stated  in  Chapters  2  and  3. 
Chapter  2  develops  the  notion  of  a  model  in  a  somewhat  more  general 
framework  than  is  typically  done  in  resolution  theorem  proving. 
Chapter  3  states  and  explains  the  main  result  of  this  report,  and 
this  result  can  be  understood  independently  of  the  particular  views 
on  models  stated  in  Chapter  2. 

The  main  result  of  this  report  is  that  the  syntactic 
resolution  strategy  known  as  Lock  Resolution,  and  the  semantic 
resolution  strategy  known  as  The  Model  Strategy  can  be  combined 
into  a  single  sound  and  complete  resolution  refinement  strategy. 
This  refinement  is  called  Hereditary-Lock  Resolution 
(HL-Resolution). Although this strategy is a very strong
refinement  strategy  in  its  present  form,  it  is  felt  that  its 
primary  value  lies  in  future  extensions  of  the  method  based  upon 
information  available  in  HL-Resolution  searches.  Such  information 
is  usually  not  available  under  other  resolution  strategies. 


1.1  Conventions  and  Abbreviations 


1.  A  resolution  step  will  refer  typically  to  the  resolution 
of  just  two  parent  clauses.  This  corresponds  to  binary 
resolution  (Chang  and  Lee,  1973).  Factoring  will  sometimes  be 
considered  as  implicit  factoring,  and  other  times  as  explicit 
factoring.  Clauses  will  sometimes  be  considered  as  sets  of 
literals,  and  are  written  with  the  literals  separated  by 
commas,  and  with  a  semicolon  to  indicate  the  end  of  the 
clause.  The  same  notation  will  also  be  used  when  it  is 
necessary  to  consider  clauses  differently  (e.g.  as  lists), 
and  in  those  contexts  where  it  is  not  explicitly  stated  how  to 
consider  the  clauses,  the  reader  is  free  to  make  his  own 
choice.

2.  An  unsatisfiable  set  of  clauses,  S,  is  said  to  be 
minimally  unsatisfiable  iff  every  proper  subset  of  S  is 
satisfiable. 

3.  In  Chapter  2  the  word  model  is  used  to  signify  a  set  of 
Herbrand  interpretations.  If  this  set  contains  only  a  single 
Herbrand  interpretation  then  it  corresponds  exactly  to  what  is 
called  a  model  in  the  resolution  theorem  proving  literature. 
Chapters  1  and  3  are  best  understood  by  using  the  word  model 
to  signify  a  single  Herbrand  interpretation. 



4.  The  phrase  "trivial  model"  will  be  used  loosely  to 
describe  a  Herbrand  interpretation  which  assigns  a  truth  value 
to  literals  based  on  the  literal  letter  (i.e.  the  predicate 
letter  plus  the  negation  sign  if  present),  and  based  on  none 
or  relatively  few  of  the  terms  that  appear  in  that  literal. 
Thus  the  models  for  hyperresolution  are  classed  as  trivial 
models.

5.  The  arrow  "=>"  is  used  to  indicate  correspondence  between 
constructs  in  two  different  languages,  with  the  item  on  the 
left  being  interpreted  as  the  item  on  the  right. 

6.  The  phrase  "singly  connected"  is  used  in  the  sense  of  (Wos 
et al., 1967).

7.  The  phrase  "normal  resolution"  is  used  to  indicate 
unrestricted  (i.e.  unrefined)  resolution  in  those  contexts 
where  it  would  otherwise  be  unclear  (and  be  of  concern)  as  to 
which  strategy  is  being  referred  to.  Similarly  for  "normal 
clause"  and  "normal  literal".  Likewise  a  phrase  such  as 
"normal  lock  literals"  is  used  to  denote  the  usual  literals 
used  in  Lock  Resolution. 


8.  The  following  abbreviations  and  notational  conventions 
will  be  used: 


    CNF        Conjunctive Normal Form
    .FA.       The quantifier "for all"
    HI         Herbrand interpretation
    HLR        Hereditary-Lock Resolution
    LC (LM)    Language of the clauses (of the model)
    LIP        Local Interaction Problem
    LR         Lock Resolution
    M          Model (usually meaning a collection of Herbrand
               interpretations), or a model evaluation function
    SC         Sentential Calculus
    SR         Semantic Resolution
    .TE.       The quantifier "there exists"
    TMS        The Model Strategy
    TSP        Term Substitution Problem
    *BOX*      The empty list (or set) of literals
    -          subtraction or set difference
    @          negation sign


The  above  listing  is  reproduced  as  the  last  page  of  this  report. 




1.2  Lock  Resolution 


Lock Resolution (Chang and Lee, 1973) (Boyer, 1971) is a
purely  syntactic  refinement  strategy  for  unrestricted  resolution, 
in  which  literals  in  the  input  set  are  assigned  integer  lock 
numbers  in  any  arbitrary  way.  The  refinement  is  to  allow  two 
clauses  to  resolve  only  on  literals,  in  each  clause,  which  are  of 
the  lowest  lock  number  to  appear  in  that  clause.  The  lock  number 
of  a  literal  in  a  resolvent  is  the  same  as  the  lock  number  of  its 
parent  literal.  When  factoring,  the  literal  eliminated  is  the  one 
with  the  higher  lock  number.  Lock  Resolution  (LR)  is  a  complete 
refinement  of  unrestricted  resolution. 

An  example  of  an  unsatisfiable  sentential  calculus  (SC)  clause 
set  with  the  literal  lock  numbers  written  as  the  second  component 
of  an  ordered  couple,  and  the  sentential  letter  as  the  first 
component,  is: 


    1.  <A,1>,<B,2>;
    2.  <C,3>,<@A,4>;
    3.  <@B,5>,<D,8>;
    4.  <@C,6>;
    5.  <@D,7>,<@A,8>;
    6.  <@D,8>,<A,9>;




The  complete  search  space  under  LR  for  this  clause  set  is 
(where  we  write  ixj  =  k  to  mean  that  clauses  numbered  i  and  j  resolve 
to  give  the  clause  numbered  by  k): 


    2x4  =  7.  <@A,4>;
    1x7  =  8.  <B,2>;
    3x8  =  9.  <D,8>;
    5x9  = 10.  <@A,8>;
    6x9  = 11.  <A,9>;
    10x1 =      a duplicate of clause 8.
    11x7 = 12.  *BOX*;
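
The search listed above can be reproduced mechanically. The following is a minimal Python sketch of ground-level Lock Resolution, not code from this report: the function names (`lock_resolvents`, `saturate`) and the representation of a clause as a frozenset of (literal, lock number) pairs are assumptions made for illustration only.

```python
from itertools import combinations

def neg(lit):
    """Complement of a sentential literal; '@' is the negation sign."""
    return lit[1:] if lit.startswith("@") else "@" + lit

def merge(pairs):
    """Factor a ground clause: a repeated literal keeps its lower lock number."""
    best = {}
    for lit, lock in pairs:
        if lit not in best or lock < best[lit]:
            best[lit] = lock
    return frozenset(best.items())

def lowest(clause):
    """Literals of a clause that carry its lowest lock number."""
    low = min(lock for _, lock in clause)
    return [(lit, lock) for lit, lock in clause if lock == low]

def lock_resolvents(c1, c2):
    """All resolvents of c1 and c2 permitted by the lock restriction."""
    return {merge((c1 - {a}) | (c2 - {b}))
            for a in lowest(c1) for b in lowest(c2) if a[0] == neg(b[0])}

def saturate(clauses):
    """Close a ground clause set under Lock Resolution."""
    known = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(known, 2):
            if c1 and c2:  # the empty clause has no literal to resolve on
                new |= lock_resolvents(c1, c2)
        if new <= known:
            return known
        known |= new

# The clause set of this section, with its lock numbers.
S = [frozenset({("A", 1), ("B", 2)}),
     frozenset({("C", 3), ("@A", 4)}),
     frozenset({("@B", 5), ("D", 8)}),
     frozenset({("@C", 6)}),
     frozenset({("@D", 7), ("@A", 8)}),
     frozenset({("@D", 8), ("A", 9)})]

closure = saturate(S)
derived = closure - set(S)   # clauses produced by the search
```

In this sketch `derived` holds the five unit clauses listed above together with the empty clause (`frozenset()`), which plays the role of *BOX*; the duplicate of clause 8 is absorbed because clauses are stored in a set.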


Lock  Resolution  is  quite  efficient  when  applied  to  (exactly  or 
nearly) minimally unsatisfiable SC clause sets. This is a result
of  purely  syntactic  properties  of  the  reductio  ad  absurdum  approach 
in  CNF.  LR  is  also  a  strong  restriction  when  applied  to  first 
order  clause  sets  because  it  is  almost  singly  connected.  This 
report  does  not  concern  itself  with  an  investigation  of  the 
underlying  properties  of  LR,  but  merely  uses  the  strategy  as  a 
basis for constructing a new refinement called HL-Resolution.




1.3  The  Model  Strategy  and  Semantic  Resolution 


Semantic Resolution (Slagle, 1967) and The Model Strategy
(Luckham, 1968) are very closely related. Semantic Resolution (SR)
involves some predicate letter ordering, and describes its basic
resolution step in terms of clashes, whereas The Model Strategy (TMS)
does not. When dealing with resolution search procedures that are
(or almost are) singly connected there seems to be little
difference between expressing the procedure in terms of clashes as
opposed to binary resolutions. The choice in this report is binary
resolution. This choice then focuses on the literal ordering in SR
as the distinguishing difference between SR and TMS, and SR can be
considered a refinement of TMS. The resolution strategy we develop
in this report adds a literal ordering to TMS which is somewhat
more restrictive than the ordering in SR.

TMS  requires  that  there  be  a  Herbrand  interpretation  (HI),  M, 
which  can  be  used  to  evaluate  the  truth  value  of  clauses.  M  is 
called  a  model.  In  TMS  a  clause  is  true  iff  every  ground  instance 
of  the  clause  is  true  in  M,  and  a  clause  is  false  iff  it  is  not 
true.  TMS  is  a  refinement  of  unrestricted  resolution  which  does 
not  allow  resolution  between  two  clauses  that  are  both  true  in  M. 


Notice that M can be any HI, and that no HI can satisfy a set of
clauses from which *BOX* can be produced by using unrestricted
resolution.
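
The TMS restriction can be illustrated at the ground (sentential) level, where a clause has only one ground instance, namely itself. The sketch below is illustrative only: the helper names `true_in` and `tms_allows` are hypothetical, the clause set is that of section 1.2 with lock numbers dropped, and the model M (every sentential letter false) is an arbitrary choice made for the example.

```python
def true_in(clause, M):
    """A ground clause is true in M iff at least one of its literals is true in M."""
    return any(M[lit.lstrip("@")] != lit.startswith("@") for lit in clause)

def tms_allows(c1, c2, M):
    """TMS forbids resolving two clauses that are both true in M."""
    return not (true_in(c1, M) and true_in(c2, M))

# Clause set of section 1.2 without lock numbers ('@' is the negation sign).
S = {1: ["A", "B"], 2: ["C", "@A"], 3: ["@B", "D"],
     4: ["@C"], 5: ["@D", "@A"], 6: ["@D", "A"]}

# Assumed model for the example: every sentential letter is false.
M = {"A": False, "B": False, "C": False, "D": False}

false_clauses = [n for n, c in S.items() if not true_in(c, M)]
allowed_pairs = [(i, j) for i in S for j in S
                 if i < j and tms_allows(S[i], S[j], M)]
```

Under this M only clause 1 is false, so every pair that TMS admits as candidate parents includes clause 1. Whether a pair can actually be resolved still depends on its containing complementary literals.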




1.4 The Intersection of Lock Resolution and The Model Strategy

Lock  Resolution  and  TMS  cannot  both  be  applied  and  preserve 
completeness. To see this consider the unsatisfiable example
clause set of section 1.2. If we choose the Herbrand
interpretation, M = (@A,B,@C,@D) as the model, we see that no
resolutions  can  be  performed  which  satisfy  both  TMS  and  LR 
refinements  (according  to  the  given  lock  numbering).  In  general  it 
is  possible  to  choose,  for  unsatisfiable  sentential  clause  sets, 


1.5  A  Completeness  Proof  of  The  Model  Strategy 
Using  Lock  Resolution 


Here  we  use  LR  to  prove  that  TMS  is  a  complete  strategy.  The 
purpose  of  this  is  to  orient  the  reader  toward  thinking  about  a 
simple  connection  between  LR  and  TMS  for  ground  level  clause  sets. 
The  basic  HL-Resolution  strategy,  to  be  presented  later,  is 
primarily  just  an  elaboration  of  this  simple  connection  so  as  to 
make  it  applicable  to  non-ground  level  clause  sets. 

Let S be an unsatisfiable set of general clauses. Then by
Herbrand's theorem there exists a finite set of ground clauses, SG,
which is a set of ground instances of clauses from S (some clauses
in S possibly being grounded in several ways) which is truth
functionally unsatisfiable. Let M be any HI for S. Then M is an HI
for  SG  also.  Let  there  be  LT  instances  of  ground  literals  in  SG 
that  are  true  in  M,  and  LF  false  in  M.  Let  the  L  =  LT  +  LF  ground 
literals  of  SG  be  lock  numbered  from  1  to  L  with  integers,  and  with 
the  true  literals  being  numbered  from  1  to  LT,  and  the  false 
literals  from  LT  +  1  to  L,  with  each  integer  used  exactly  once. 

Then  any  lock  ground  refutation,  R',  of  SG  with  this  lock 
numbering  is  also  a  ground  refutation  under  TMS  with  model  M.  Now 
we  consider  R'  as  a  refutation  in  unrestricted  resolution,  i.e.  we 
ignore the lock numbers, and we see that R' can be lifted in an
obvious  way  to  correspond  to  a  general  level  refutation,  R,  of  S. 
This  R  will  also  be  a  refutation  of  S  satisfying  TMS  with  model  M. 
Thus,  since  LR  is  complete  for  an  arbitrary  assignment  of  lock 
numbers,  we  have  shown  that  TMS  is  complete  for  an  arbitrary  HI,  M. 
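
The ground-level part of this argument can be checked on a small case. The following Python sketch is an illustration under stated assumptions, not the report's own procedure: `number_locks` and `lock_steps` are hypothetical names, the clause set is that of section 1.2 (already ground), and M is the interpretation of section 1.4. True literal occurrences receive lock numbers 1 to LT and false ones LT+1 to L; the sketch then records every Lock Resolution step so that the TMS condition can be inspected.

```python
from itertools import combinations

def neg(lit):
    return lit[1:] if lit.startswith("@") else "@" + lit

def is_true(lit, M):
    return M[lit.lstrip("@")] != lit.startswith("@")

def number_locks(clauses, M):
    """Number literal occurrences: true in M get 1..LT, false get LT+1..L.
    Assumes no literal occurs twice within one clause."""
    occs = [(i, lit) for i, c in enumerate(clauses) for lit in c]
    ordered = sorted(occs, key=lambda o: not is_true(o[1], M))  # stable sort
    lock = {occ: n for n, occ in enumerate(ordered, start=1)}
    return [frozenset((lit, lock[(i, lit)]) for lit in c)
            for i, c in enumerate(clauses)]

def lowest(clause):
    low = min(n for _, n in clause)
    return [(lit, n) for lit, n in clause if n == low]

def merge(pairs):
    best = {}
    for lit, n in pairs:
        best[lit] = min(n, best.get(lit, n))  # keep the lower lock number
    return frozenset(best.items())

def lock_steps(clauses):
    """Saturate under Lock Resolution; return (clauses found, parent pairs used)."""
    known, steps, changed = set(clauses), [], True
    while changed:
        changed = False
        for c1, c2 in combinations(list(known), 2):
            if not (c1 and c2):
                continue  # the empty clause cannot be a parent
            for a in lowest(c1):
                for b in lowest(c2):
                    if a[0] == neg(b[0]):
                        steps.append((c1, c2))
                        r = merge((c1 - {a}) | (c2 - {b}))
                        if r not in known:
                            known.add(r)
                            changed = True
    return known, steps

def clause_true(clause, M):
    return any(is_true(lit, M) for lit, _ in clause)

SG = [["A", "B"], ["C", "@A"], ["@B", "D"], ["@C"], ["@D", "@A"], ["@D", "A"]]
M = {"A": False, "B": True, "C": False, "D": False}

known, steps = lock_steps(number_locks(SG, M))
box_found = frozenset() in known
tms_respected = all(not (clause_true(p, M) and clause_true(q, M))
                    for p, q in steps)
```

In this sketch the empty clause is found and no recorded step has two parents that are both true in M. The reason is the one used in the proof: a clause true in M has a true literal, so its lowest-numbered literal is true, and two complementary literals cannot both be true in M.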




Notice  that  it  may  be  impossible  to  assign  lock  numbers  to  the 
general  level  literals  of  S  so  as  to  make  R  also  a  LR  refutation. 
This  occurs  because  a  single  general  level  clause  in  S  may 
represent  several  distinct  ground  instances  in  SG.  It  may  be  the 
case  that  the  assignment  of  lock  numbers  to  these  several  ground 
instances  must  be  made  in  a  manner  inconsistent  with  a  single 
linearly ordered sequence of literals of the general level clause.

The  ability  to  prove  the  completeness  of  TMS  so  easily  using 
LR  leads  to  two  immediate  considerations: 

1.  they  are  probably  closely  related; 

2.  LR  seems  to  be  at  least  as  strong  a  refinement  as  TMS. 

Both  of  these  statements  are  quite  true  for  a  minimally 
unsatisfiable  SC  problem,  which  is  exactly  what  the  unsatisfiable 
ground  set  (SG,  above)  refutation  problem  is.  When  dealing  with 
general  clauses  with  infinite  Herbrand  universes  however,  a  major 
(in  fact  generally  the  major)  problem  is  in  generating  the  proper 
type  and  degree  of  instantiation  of  the  general  clauses  so  as  to 
cover  some  unsatisfiable  Herbrand  ground  set.  In  this  task  LR, 
which  is  a  purely  syntactic  strategy  which  does  not  directly  take 
into  account  any  of  the  arguments  of  a  literal,  loses  much  of  its 
advantage  relative  to  TMS. 

After explaining some aspects of some models that could be
implemented, a refinement strategy will be developed that combines
LR and a stronger form of TMS.




2.0  Models  and  Resolution  Searches 


This  chapter  develops  a  particular  point  of  view  about  what 
constitutes  a  model  to  be  used  in  a  resolution  theorem  proving 
search.  This  view  holds  that  a  model  may  be  specified  through  a 
combination  of  declarative  and  procedural  information,  and  that  the 
declarative  information  need  not  be  fully  determined  at  the  time 
that  the  resolution  search  begins.  This  material  is  presented  in 
sections  2.1  and  2.2  in  the  form  of  examples,  but  no  attempt  is 
made  to  give  a  formal  statement  of  the  concepts  involved.  The 
intention  is  to  develop  a  concept  of  a  model  which  will  be 
compatible  with  such  strategies  as  TMS  and  SR. 

Section  2.3  indicates  a  difficulty  in  the  relationship  between 
a  model  and  the  resolution  search  process  (in  this  case  TMS)  that 
uses  the  model,  and  is  the  immediate  motivation  for  Chapter  3, 
which  presents  the  HL-Resolution  strategy. 


2.1 Models for Use With Semantic Strategies: A Specific Example

TMS  requires  that  clauses  be  evaluated  according  to  some 
model,  M.  The  model  will  be  some  HI  for  the  clause  set  under 
consideration.  If  for  every  ground  instance  of  a  clause  at  least 
one  literal  of  the  clause  is  true  in  M,  then  the  clause  is  true  in 
M.  As  far  as  TMS  is  concerned,  it  is  not  necessary  to  identify 
which  ground  literals  are  true,  and  which  false,  nor  to  identify 
which ground instances of a clause are true or false. Thus it
doesn't matter what method of clause evaluation is used, just so
long as the classification of clauses is the same as would result
from using some fixed HI.

Two  extreme  examples  of  models  that  have  been  used  or 
suggested  for  semantic  resolution  strategies  are  the  trivial  models 
and  the  ground  case  models. 

The  trivial  models  evaluate  the  truth  of  a  literal  based  upon 
the  literal  being  classed  in  one  of  a  few  simplistically 
recognizable  categories.  The  most  extreme  case  of  this  is  to 
assume  only  two  categories,  e.g.  positive  and  negative  instances 
of a given predicate letter. Such are the models implicit in P1
(and  also  N1)  (Nilsson,  1971)  resolution  and  positive  (negative) 
hyperresolution.  Trivial  models  typically  require  only  a  small 
amount  of  computational  effort  in  their  truth  evaluations  of 
clauses. 

At  the  other  extreme  are  the  ground  case  models,  which  are 
based  on  a  complete  finite  list  of  all  the  ground  instances  for 
each  predicate  letter  which  are  true,  and,  usually,  a  complete  list 
of  function  values  for  each  ground  instance  of  each  function 
letter.  Clearly  such  an  approach  requires  the  use  of  small  domains 
of  individuals.  Even  with  rather  small  domains,  however,  the 
computational  effort  to  evaluate  clauses  can  easily  become  quite 
large  (Henschen,  1975). 

There  are  intermediate  types  of  models.  These  models  may 
contain  some  explicit  information  in  ground  instance  form,  and 
evaluate the truth of literals and clauses, which have ground
instances not explicitly listed, by some algorithm. In these cases
it  is  the  combination  of  the  explicit  ground  cases  plus  the 
algorithm  which  constitutes  the  model.  Such  models  will  be  called 
algorithmic  based  models,  or  A-models. 

An  example  of  an  A-model  will  be  given  for  the  satisfiable 
clause  set  (which  we  call  GAX  for  Group  AXioms). 


    1.  P(x,y,*(x,y));                              closure
    2.  P(1,x,x);                                   left identity
    3.  P(x,1,x);                                   right identity
    4.  P(x,I(x),1);                                right inverse
    5.  P(I(x),x,1);                                left inverse
    6.  @P(x,y,z),@P(z,u,v),@P(y,u,w),P(x,w,v);     associativity
    7.  @P(x,y,z),@P(x,w,v),@P(y,u,w),P(z,u,v);     associativity

This  clause  set  could  be  part  of  an  unsatisfiable  clause  set 
containing  the  negation  of  a  theorem  about  groups.  In  that  case 
the  interpretation  intended  in  formulating  the  problem  would  be: 

    P(x,y,z)  =>  x and y combined under the group operation
                  yields result z.
    *(x,y)    =>  the result of the group operation on x and y.
    1         =>  the group operation identity element.
    I(x)      =>  the inverse of x for the group operation.




A  possible  A-model  for  GAX  could  be  constructed  by  use  of  the 
sentential  calculus  (SC)  with  the  following  interpretation  of  the 
elements  of  the  language  of  GAX: 

P(x,y,z)  =>  the  statement:  (x  and  y)  iff  z 

*(x,y)  =>  the  expression:  x  and  y 

1  =>  the  sentential  letter:  1 
I(x)  =>  the  sentential  letter:  1 

In  order  to  be  more  definite  about  this  for  the  purposes  of 
this  example,  we  will  assume  that  the  model  manipulations 
themselves  will  be  done  in  CNF  format.  Then  the  A-model  will  be  a 
set  of  sentential  clauses,  which  contain  the  ground  case 
information,  plus  the  algorithm.  The  algorithm  contains  the 
translation  information  for  translating  clauses  into  the  SC,  and  a 
decision  procedure  for  the  SC. 

The  algorithm  must  translate  constructs  in  the  clause  set 
language,  LC,  into  constructs  of  the  model  language,  LM,  in  such  a 
way  that  a  truth  decision  can  be  made.  The  following  is  the  way 
this  is  done  for  the  specific  model  we  are  constructing. 

    P(x,y,z)   =>  the clause set:  @x,@y,z;
                                    @z,x;
                                    @z,y;

    @P(x,y,z)  =>  the clause set:  @x,@y,@z;
                                    x,z;
                                    y,z;

    *(x,y)     =>  the expression:  x and y
    1          =>  the sentential letter:  1
    I(x)       =>  the sentential letter:  1


We  assume  (arbitrarily  for  the  purpose  of  this  example)  the 
truth of the literal "1", and include it in the model as a clause
"1;",  thereby  establishing  the  only  piece  of  ground  case 
information.  In  order  to  evaluate  a  clause  in  the  model,  e.g.  the 
clause  GAX.2, 


    GAX.2    P(1,x,x);

we see if the negation of this clause, when translated into the
language of the model, is consistent with the ground case clause
"1;". If it is consistent, then the negation of this clause has an
instance which is true in the model, and therefore the clause
(unnegated) has an instance which is false in the model, and the
model evaluation is false. On the other hand if the negation is
inconsistent, then the model evaluation for the clause is true.


Thus,  the  negation  of  GAX.2,  namely 

    @(.FA.x P(1,x,x))

becomes

    .TE.x @P(1,x,x)




and then translates into the SC clause set:

    1;             <- ground case information
    @1,@x,@x;      (SC clauses
    1,x;            associated with
    x,x;            .TE.x @P(1,x,x))


This  is  an  inconsistent  SC  clause  set,  and  so  the  clause  GAX.2 
is  evaluated  as  true  in  this  model.  This  model  will  evaluate 
clauses 1, 2, 3, 6 and 7 of GAX as true, and clauses 4 and 5 as false.
Thus, e.g., clause GAX.4 is translated as

    @(.FA.x P(x,I(x),1))

which becomes

    .TE.x @P(x,I(x),1)

which becomes

    1;
    @x,@1,@1;
    x,1;
    1,1;

which is consistent, giving an evaluation of false for GAX.4 in the
model.  Notice  that  the  variables,  which  are  now  existentially 
quantified  in  LM,  do  not  have  the  quantifier  explicitly  written  in 
LM. Since the quantifiers have the entire set of sentential
clauses in LM as their scope this causes no problems.
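
The evaluation procedure of this section can be sketched in a few lines of Python. This is only an illustration under assumptions: the term encoding (strings for variables and the constant 1, tuples for `*` and `I`), the function names, and the use of a brute-force truth table as the SC decision procedure are all choices made here, not taken from the report. A clause is evaluated true exactly when its negation is inconsistent with the ground case information, i.e. when no truth assignment consistent with "1;" falsifies every literal.

```python
from itertools import product

def val(term, env):
    """Truth value of a term of LC under the SC translation of this section."""
    if isinstance(term, tuple):
        if term[0] == "*":               # *(x,y) => the expression: x and y
            return val(term[1], env) and val(term[2], env)
        if term[0] == "I":               # I(x)   => the sentential letter: 1
            return env["1"]
    return env[term]                     # a variable, or the letter 1

def letters(term):
    """Sentential letters that the translation of a term mentions."""
    if isinstance(term, tuple):
        if term[0] == "I":
            return {"1"}
        return set().union(*(letters(t) for t in term[1:]))
    return {term}

def holds(literal, env):
    """P(a,b,c) => (a and b) iff c; a literal is (is_positive, (a, b, c))."""
    positive, (a, b, c) = literal
    p = (val(a, env) and val(b, env)) == val(c, env)
    return p if positive else not p

def evaluates_true(clause, ground=None):
    """True iff the negated clause is inconsistent with the ground information."""
    ground = {"1": True} if ground is None else ground
    names = sorted(set().union(*(letters(t) for _, args in clause for t in args))
                   - set(ground))
    for bits in product([False, True], repeat=len(names)):
        env = dict(ground, **dict(zip(names, bits)))
        if not any(holds(lit, env) for lit in clause):
            return False                 # the negation is consistent
    return True

GAX = [
    [(True, ("x", "y", ("*", "x", "y")))],
    [(True, ("1", "x", "x"))],
    [(True, ("x", "1", "x"))],
    [(True, ("x", ("I", "x"), "1"))],
    [(True, (("I", "x"), "x", "1"))],
    [(False, ("x", "y", "z")), (False, ("z", "u", "v")),
     (False, ("y", "u", "w")), (True, ("x", "w", "v"))],
    [(False, ("x", "y", "z")), (False, ("x", "w", "v")),
     (False, ("y", "u", "w")), (True, ("z", "u", "v"))],
]

results = [evaluates_true(c) for c in GAX]
```

With the ground fact "1;" assumed, `results` classifies clauses 1, 2, 3, 6 and 7 as true and clauses 4 and 5 as false, matching the evaluations stated above.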




This  report  treats  A-models  in  an  informal  fashion  since  their 
complete  characterization  has  not  yet  been  accomplished.  Because 
of  this  several  extremely  important  details  concerning  the 
relationship  between  A-models  and  the  clause  sets  they  are  modeling 
will  be  ignored  in  this  report. 

There  are  many  other  SC  models  that  could  be  developed  for  GAX 
by  changing  the  ground  case  information  and  by  changing  the 
translation  mapping  from  LC  to  LM.  There  are  also  many  other  types 
of  models  than  just  the  SC  models,  and  the  next  section  will 
mention  some  of  them. 


2.2  The  Connection  Between  A-Models  and  Herbrand  Interpretations 

The preceding section illustrated some features of a specific
A-model.  Some  features  of  A-models  in  general  are: 

1.  The  model  performs  its  evaluation  based  on  its  algorithm 
(including the translation from LC to LM), the ground case
information (in LM), and the set of literals being evaluated
(in LC).

2.  The  computation  is  finite,  and  is  comprised  of  testing  the 
consistency  of  a  set  of  statements  in  LM.  This  requires  that 
the  model  be  a  system  with  a  well  defined  notion  of 
consistency,  and  that  there  be  a  decision  procedure  which  is 
efficient enough to make it practical in a resolution search
process.

3.  It  is  possible  to  modify  the  model  by  changing  either  the 
ground  case  information  or  the  algorithm. 

4.  Universally  quantified  variables  (in  LC)  from  the  clause  to 
be  evaluated  are  all  changed  to  existential  constructs  (in  LM). 
Note  that  there  are  no  existential  quantifiers  in  LC. 

5.  The  process  of  negating  a  clause  to  be  evaluated  eliminates 
the  "OR-ing"  and  replaces  it  with  an  "AND-ing".  Thus  a  clause 
of  k  literals  in  LC  translates  to  a  set  of  k  or  more  statements 
in LM.

Notice that items 4 and 5 above, along with the fact that we
are translating and testing individually only one clause at a time
from a clause set in CNF, strongly structure the task we ask the
A-model to accomplish. In particular we do not need to have the
notion  of  universally  quantified  variables,  in  LM,  whose  scope  of 
quantification  would  be  restricted  to  the  statements  in  LM  coming 
from  the  clause  we  are  evaluating. 

The  question  arises  as  to  just  what  HI  is  actually  the  one 
corresponding  to  the  evaluations  performed  by  a  particular  A-model. 

The answer is that in general A-models will be using a set of
HI's to evaluate clauses. A clause is evaluated to true iff it is
true in each HI in the set. As an example of this consider the
following SC A-model for the equality predicate with the
translation:

    =(x,y)   =>  the clause set:  @x,y;
                                  @y,x;

    @=(x,y)  =>  the clause set:  x,y;
                                  @x,@y;

    a        =>  the sentential letter:  a
    b        =>  the sentential letter:  b


and assume no ground case information.

If we evaluate the unit clause =(a,b);, the clause set in LM is

    a,b;
    @a,@b;

which is consistent, so that the clause =(a,b); is false.

Then, evaluating the clause @=(a,b);, which translates to

    @a,b;
    @b,a;

we also have a consistent clause set, thus yielding false for
@=(a,b);.

Clearly this A-model is not evaluating clauses correctly
according to any HI. What is actually happening is that the
evaluation is according to some class, M, of HI's. In some of
these =(a,b); is false, so that it is evaluated false. In the
rest of M, @=(a,b); is false, so that it too is evaluated to
false.


One thing that can be done in cases like this (i.e. when the
A-model is using a class of HI's) is to ensure that every clause
that is evaluated is uniformly true or uniformly false over the
class, M, of HI's being used. When encountering a clause not
uniform  in  truth  in  M,  then  M  must  be  replaced  by  an  M'  c  M,  such 
that  the  clause  is  uniform  in  M'.  M'  would  then  be  used  until  it 
became  necessary  to  choose  an  M"  c  M'.  Thus  successive  clause 
evaluations  would  cause  modifications  of  the  ground  case 
information,  or  the  algorithmic  parts  of  the  model,  or  both. 
Notice  that  once  a  clause  is  evaluated,  its  truth  evaluation  is  not 
affected  by  future  changes  in  the  model,  since  it  was  uniformly 
true  or  false  when  evaluated,  and  the  only  model  changes  permitted 
are  those  that  replace  the  model  class  with  a  subclass  of  itself. 

For  simplicity,  and  definiteness,  we  assume  that  the  only  way 
to  modify  a  model,  i.e.  restrict  M,  is  to  add  ground  case 
information  so  as  to  make  a  clause  that  is  not  uniform  in  M  become 
uniformly  true  in  M'  c  M.  The  way  to  do  this  is: 

1.  If  the  clause  evaluates  to  true  in  M,  no  modification  is 
performed  since  it  must  be  uniformly  true  in  M. 

2.  If  the  clause  evaluates  to  false  in  M,  but  is  not  uniform 
in  M  (see  below),  then  we  try  to  find  various  items  of  ground 
information, G1, G2, ..., such that:

a) The current ground information is consistent with the
expression G1 and G2 and ... ;

b) The clause evaluates to true when evaluated by the model
using the combined old ground case information plus G1, G2,
... .


If  a  and  b  above  are  both  satisfied,  then  the  new  model  is  the 
same  as  the  old  one  but  with  the  ground  case  information  augmented 
by G1, G2, ... . We do not elaborate here how to find a suitable
set of ground facts, but mention that sometimes there will exist no
such  set  of  ground  facts.  In  this  case  the  model  is  left 
unmodified,  and  the  clause  is  evaluated  as  false  in  M.  Such  a 
default  assignment  of  false  to  a  clause  causes  no  problems  (beyond 
a  decrease  in  search  efficiency)  for  the  types  of  refinement 
strategies  discussed  in  this  report  (i.e.  TMS,  SR  and  HLR). 

A  simple  example  of  this  model  updating  process  is  the 
equality  example,  above,  where  now,  when  the  =(a,b);  evaluation 
comes  out  false,  we  augment  the  ground  (in  this  instance  empty) 
information  set  with  the  information  in  the  clause,  namely 

@a,b; 

@b,a; 

The  ground  information  set  is  still  consistent,  so  this  is 
acceptable. The clause =(a,b); would now evaluate true.

Now  the  clause  @=(a,b);  translates  to 

    @a,b;    ground information
    @b,a;    ground information
    @a,b;    from the clause @=(a,b)
    @b,a;    from the clause @=(a,b)

and  this  is  a  consistent  set,  so  that  @=(a,b);  evaluates  to  false. 
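
The equality example and its update step can be followed with a small Python sketch. It is an illustration only: the helper names (`model_class`, `evaluate`, `uniform`) and the explicit enumeration of truth assignments to the letters a and b as the class M of HI's are assumptions made here, and the report's A-model does not enumerate interpretations in this way.

```python
from itertools import product

def satisfies(env, clause):
    """An SC clause (list of literals, '@' = negation) is satisfied by env."""
    return any(env[lit.lstrip("@")] != lit.startswith("@") for lit in clause)

def model_class(names, ground):
    """All truth assignments to `names` that satisfy the ground case clauses."""
    envs = (dict(zip(names, bits))
            for bits in product([False, True], repeat=len(names)))
    return [env for env in envs if all(satisfies(env, c) for c in ground)]

def evaluate(statement, M):
    """A clause is evaluated true iff it is true in every HI of the class M."""
    return all(statement(env) for env in M)

def uniform(statement, M):
    """A clause is uniform in M iff it has the same truth value throughout M."""
    return len({statement(env) for env in M}) <= 1

eq = lambda env: env["a"] == env["b"]      # the clause  =(a,b);
neq = lambda env: env["a"] != env["b"]     # the clause @=(a,b);

M = model_class("ab", [])                  # no ground case information
before = (evaluate(eq, M), evaluate(neq, M), uniform(eq, M))

# Augment the ground information with  @a,b;  @b,a;  giving a subclass M' of M.
M1 = model_class("ab", [["@a", "b"], ["@b", "a"]])
after = (evaluate(eq, M1), evaluate(neq, M1), uniform(eq, M1))
```

Before the update both clauses evaluate to false and =(a,b); is not uniform in M. After the update =(a,b); evaluates to true, @=(a,b); still evaluates to false, and both are uniform in the smaller class.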




We  are  left  with  the  problem  of  determining,  when  a  clause 
evaluates  to  false  in  M,  if  it  is  uniformly  false.  A  clause  is 
uniformly  false  in  M  if  the  set  of  statements  in  LM  representing 
the  negation  of  the  clause  are  not  only  consistent,  but  are  in  fact 
consistent  in  all  interpretations  which  satisfy  the  ground  case 
information  and  the  information  implicit  in  the  algorithm  for 
testing  consistency.  As  an  example  consider  the  evaluation  of 
clause  GAX.4  as  done  in  section  2.1  .  The  SC  clause  set  obtained 
there  was  consistent,  and  thus  GAX.4  was  evaluated  to  false. 
However,  that  SC  clause  set  is  a  true  set  of  statements  only  when 
the sentential letter "x" is assumed false. If "x" is assumed
true,  then  the  SC  clause  set  is  untrue.  Thus  GAX.4  is  not 
uniformly  false  in  this  model.  Furthermore  there  is  no  ground  case 
information  that  can  be  added  to  this  model  which  will  make  GAX.4 
uniformly  true.  We  note  here  that  this  does  not  seem  to  be  an 
unresolvable  difficulty  with  A-models.  We  do  not  develop  potential 
solutions  to  this  difficulty  in  this  report. 

The  above  explanation  is  in  no  way  yet  sufficiently  formalized 
so  that  it  is  possible  to  prove  that  it  is  a  consistent  way  of 
viewing  A-models.  Such  a  formalization  has  not  yet  been  attempted. 
However,  this  type  of  orientation  toward  A-models  does  seem  to  have 
some  utility,  since  there  are  other  types  of  A-models  which  use 
different  internal  languages  and  different  processing  algorithms, 
but  are  still  quite  analogous  (with  respect  to  the  topics  of  this 
section)  to  the  SC  A-models.  One  of  these  models  is  the 
simultaneous  linear  equation  model  (SLE).  This  model  translates 
literals into equations involving the arguments of the literal.
Thus a clause would be translated into a set of simultaneous
linear  equations,  and  the  processing  algorithm  is  a  decision 
procedure  testing  the  set  for  consistency.  To  see  that  the  SLE 
model  acts  as  a  theorem  proving  system  analogous  to  the  SC  model, 
one  need  merely  consider  a  set  of  equations  to  be  a  set  of  unit 
clauses,  each  using  the  two  place  equality  predicate  (negated  for 
inequalities),  with  terms  built  out  of  constant  symbols, 
existentially  quantified  variables,  and  function  symbols 
corresponding to the usual arithmetic operations. In addition
there  would  be  other  clauses  expressing  the  rules  of  algebraic 
manipulations.  In  an  actual  implementation  it  would  probably  not 
be  efficient  to  do  consistency  checking  in  the  SLE  model  as  a 
theorem  proving  search,  but  it  is  reasonable  to  consider  it  that 
way  conceptually.  Section  3.2  of  this  report  is  an  example  of  an 
HLR  search  using  a  SLE  model. 

Other  possible  A-models  are  the  decidable  parts  of  Euclidean 
geometry  or  of  real  analysis.  Our  viewpoint  of  models  is  not  meant 
to  be  restricted  to  what  are  typically  thought  of  as  mathematical 
models.  The  discussion  of  models  will  however  be  limited  to  the  SC 
and  SLE  models  in  this  report,  since  these  two  models  will  be 
familiar  to  the  reader,  and  are  quite  sufficient  for  illustrative 
purposes.

We leave now the discussion of A-models. In the remainder of
this report the word "model" may be thought of as signifying a
single HI.


2.3 A Defect in The Model Strategy

TMS  evaluates  the  truth  values  of  clauses  when  they  are 
generated,  giving  them  a  single  truth  value  for  all  of  the 
remainder of the search. But often a clause that has been
determined to be false, and is then resolved with a true clause,
does not actually qualify as a false clause, because the unification
operation on the false clause eliminates all of its false
instances. An example of this is the false clause

P(x,1,y), <(x,y);

and the true clause

@P(a,1,b);

where  the  model  is  taken  to  be 

P(x,y,z)  =>  x  times  y  is  z 
<(x,y)  =>  x  is  less  than  y 
a  =>  10 
b  =>  20 

1  =>  identity  element  for  multiplication 


and the domain of individuals is that of the positive integers.

The resolvent of these two clauses is

<(a,b);

which is true in the above model.
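
The loss of false instances in this example can be checked directly. The sketch below evaluates the two clauses under the stated model; restricting the positive integers to a finite sample (`DOMAIN`) is an assumption made so that the instances can be enumerated, and the Python names are illustrative only.

```python
from itertools import product

DOMAIN = range(1, 31)   # finite sample of the positive integers (assumption)
a, b = 10, 20           # interpretation of the constants a and b

def P(x, y, z):
    """P(x,y,z) => x times y is z."""
    return x * y == z

def less(x, y):
    """<(x,y) => x is less than y."""
    return x < y

# False parent  P(x,1,y), <(x,y);  collect its false instances in the sample.
false_instances = [(x, y) for x, y in product(DOMAIN, repeat=2)
                   if not (P(x, 1, y) or less(x, y))]

# True parent  @P(a,1,b);  is true because 10 times 1 is not 20.
true_parent = not P(a, 1, b)

# Unifying P(x,1,y) with P(a,1,b) forces x = a and y = b, and none of the
# false instances of the false parent survives that substitution.
surviving_false = [(x, y) for x, y in false_instances if (x, y) == (a, b)]

# The resolvent  <(a,b);  is true in the model.
resolvent = less(a, b)
```

Here `false_instances` is non-empty (any pair with x greater than y qualifies), while `surviving_false` is empty, which is exactly the situation the text describes.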


The  problem  is  not  that  the  resolvent  is  true.  There  is  no 
way  in  TMS  to  maintain  completeness  if  true  resolvents  are 
forbidden.  Rather,  the  problem  is  that  the  resolvent  has  a  literal 
in  it  which  came  from  the  false  parent,  but  which  now  has  no  false 
instances.  It  is  clear  (refer  to  section  1.5  for  the  proof  of  TMS 
using  LR)  by  looking  at  the  ground  case  proof  that  must  exist,  that 
such  an  occurrence  of  loss  of  false  instances  can  be  forbidden  in  a 
TMS  search  without  a  loss  of  completeness.  Further  reflection  on 
the  form  of  the  clauses  and  the  resolutions  which  appear  in  the 
ground  proof  trees  involved  in  the  proof  of  TMS  leads  to  a  complete 
refinement  strategy  for  unrestricted  resolution  which  combines  LR 
and  semantic  considerations  stronger  than  TMS.  The  next  chapter  is 
a  presentation  of  such  a  strategy. 


3.0  Hereditary-Lock  Resolution 

This  chapter  presents  the  main  result  of  this  report.  This 
result  is  the  development  of  a  sound  and  complete  refinement 
strategy  for  resolution,  called  Hereditary-Lock  Resolution 
(HL-Resolution, or HLR). This strategy combines a semantic
refinement  (TMS)  and  a  syntactic  refinement  (LR)  in  such  a  manner 
that  the  refinement  restricts  the  way  that  a  clause  may  be  used 
based  upon  the  way  that  clause  was  generated  in  the  search. 

Sections  3.1,  3.2  and  3.3  are  an  informal  introduction  to  the 
strategy,  including  a  hand  worked  example.  Section  3.4  is  a  formal 
definition  of  the  HLR  refinement.  Section  3.5  is  the  statement  and 
proof  of  the  soundness  and  completeness  theorem  for  HL-Resolution. 
Sections  3.6  and  3.7  give  some  perspectives  on  the  utility  of  the 
strategy  and  indicate  some  areas  that  require  further 
investigation.


3.1 HL-Resolution: An Informal Description


This section states what HL-Resolution is in an informal
manner. The purpose here is neither to give an exact formal
definition nor to consider implementation details but rather to
give the reader an idea of the overall structure of the strategy.


Let  M  be  a  Herbrand  interpretation  and  L  a  set  of  literals.  A 
grounding  substitution  for  L  is  a  substitution  which  when  applied 
to every literal in L converts the literal to a ground instance. M
is said to satisfy a set of literals, L, iff for all grounding
substitutions, SIGMA, there exists a k such that k ∈ L(SIGMA) and k
is true in M. If L is a clause and M satisfies L (i.e. L viewed
as a set of literals), then we say that L is true in M.

An HL-clause, C, is an ordered pair, C = <SD(C),FSL(C)>, with
each  element  being  a  set  of  HL-literals.  SD(C)  is  called  the  set 
of  standard  literals  of  C,  and  FSL(C)  is  called  the  (set  of)  false 
substitution  literals  of  C. 

The  standard  literals  correspond  to  the  usual  literals  in  a 
clause  in  normal  resolution.  The  FSL  literals  constitute  a 
restriction  on  the  ground  instances  which  a  clause  represents.  An 
HL-clause  represents  all  ground  instances  of  its  standard  literals 
in  which  the  grounding  substitution,  THETA,  is  a  grounding 
substitution  for  both  the  standard  and  FSL  literal  sets,  and  such 
that  for  every  literal,  L,  in  the  FSL  set,  L(THETA)  is  not  true  in 
M. 

Let  C  be  an  HL-clause.  Let  RHO  be  the  set  union  over  all 
variables  appearing  anywhere  in  C  and  all  literals  appearing 
anywhere  in  C.  Let  MU  be  the  set  of  variables  appearing  in  the 
standard  literals  of  C.  We  execute  iteratively  the  following  two 
steps  until  no  further  change  occurs  to  MU: 

1. Move all literals in RHO containing a variable in MU into MU.

2. Move all variables in RHO which are contained in a literal
in MU into MU.

We  define  a  variable  or  literal  to  be  influential  in  C  iff  it 
is  in  MU.  If  L  is  a  literal  in  the  FSL  set  of  the  clause  C,  and  if 
L  is  not  influential  in  C,  then  L  may  be  deleted  from  the  FSL  of  C. 
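
The two-step iteration above is a closure computation and can be sketched as follows. The representation (each literal given as text together with its set of variable names) and the function name `influential` are assumptions for illustration; the clause used in the example is hypothetical and does not come from the report.

```python
def influential(standard, fsl):
    """Compute the influential variables and literals of an HL-clause.

    standard, fsl: dicts mapping a literal (as text) to its set of variables.
    Returns (mu_vars, mu_lits), the variable and literal parts of MU.
    """
    all_literals = {**fsl, **standard}
    mu_vars = set().union(*standard.values())   # variables of the standard literals
    mu_lits = set()
    changed = True
    while changed:
        changed = False
        for lit, variables in all_literals.items():
            # Step 1: a literal containing a variable in MU moves into MU.
            if lit not in mu_lits and variables & mu_vars:
                mu_lits.add(lit)
                changed = True
            # Step 2: the variables of a literal in MU move into MU.
            if lit in mu_lits and not variables <= mu_vars:
                mu_vars |= variables
                changed = True
    return mu_vars, mu_lits

# Hypothetical clause: one standard literal and three FSL literals.
standard = {"A(x)": {"x"}}
fsl = {"B(x,y)": {"x", "y"}, "C(y,z)": {"y", "z"}, "D(w)": {"w"}}

mu_vars, mu_lits = influential(standard, fsl)
deletable = [lit for lit in fsl if lit not in mu_lits]
```

In this hypothetical clause D(w) shares no variable, directly or through a chain of literals, with the standard literals, so it is the only FSL literal that is not influential.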

Let  L  be  a  literal,  then  by  @L  is  meant  the  same  literal  but 
with  the  opposite  sign,  i.e.  @L  contains  a  negation  sign  iff  L 
does  not  contain  a  negation  sign. 

Every  standard  literal  has  associated  with  it  two  numbers. 
One  is  called  the  true  lock  number,  and  the  other  is  called  the 
false  lock  number.  It  is  through  the  use  of  these  lock  numbers 
that  HL-Resolution  incorporates  LR  as  part  of  its  refinement.  To 
understand  why  each  literal  must  have  more  than  one  lock  number 
refer  back  to  section  1.5,  where  TMS  was  proved  complete  by  an 
argument  based  on  LR.  There  it  is  seen  that  in  the  ground  case  it 
is  necessary  to  have  all  the  lock  numbers  of  true  literals  smaller 
than  all  the  lock  numbers  of  false  literals,  in  order  to  make  the 
LR search automatically satisfy the TMS restriction of having a false
parent in every resolution step. When searching at the general
level, then, it is necessary to have a mechanism which can
accomplish  the  same  effect  as  assigning  small  lock  numbers  to  true 
literals  and  large  lock  numbers  to  false  literals.  The  problem  at 
the  general  level  is  that  a  given  clause  may  really  need  to  be  used 
to  represent  several  distinct  ground  instances,  and  the  truth  value 
of  any  particular  general  level  literal  might  be  different  in  the 
different  ground  instances.  HL-Resolution  searches  are  concerned 
with  keeping  track  of  the  use  of  clauses  with  respect  to  which 
ground instances are the ones intended to be represented by the
clause. This information is implicitly held, in the FSL of the
clause,  at  the  level  of  specifying  a  set  of  literals  that  must  be 
false.  Completeness  of  the  HL-search  is  maintained  through  the  use 
of  the  double  lock  numbers  on  literals,  the  true  (false)  lock 
number  being  used  if  a  literal  is  supposed  to  stand  for  true 
(false)  instances  of  itself.  Exactly  how  this  is  accomplished 
should  be  apparent  later  when  some  examples  are  given.  Most  of  the 
complications  arise,  in  explaining  the  HL-Resolution  method,  in 
treating  standard  literals  for  which  the  FSL  does  not  contain 
enough  information  to  restrict  them  to  necessarily  true  or 
necessarily  false  instances. 

The  input  clauses  for  HL-Resolution  are  similar  to  input 
clauses  in  normal  resolution,  except  that: 

1.  each  clause  has  an  FSL  part,  and  it  is  empty; 

2.  true  and  false  lock  numbers,  all  distinct,  have  been 
assigned  to  all  the  literals  (and  all  of  the  true  lock  numbers 
are  smaller  than  all  of  the  false  lock  numbers). 

Each  input  clause,  and  during  the  search  every  clause 
generated,  is  evaluated  according  to  M,  and  marked  "T"  if  it  only 
has  true  ground  instances,  "F"  if  it  only  has  false  ground 
instances,  or  "T/F"  if  it  has  both  true  and  false  ground  instances. 
Note  that  this  evaluation  is  relative  to  the  FSL  restriction,  i.e. 
we  only  consider  grounding  substitutions  that  make  every  FSL 
literal  false.  Section  3.2  contains  an  example  of  how  a  HL-clause 
can  be  evaluated  relative  to  its  FSL  restrictions.  We  may  view  the 
search process as a breadth first binary resolution search
(factoring will be discussed later) with several refinements and
one extension.

The extension is that there is an asymmetry in the role played
by the two parents in a resolution step. Thus given two clauses,
C1 and C2, we consider two distinct resolution possibilities, and
express this as resolving C1 against C2, and C2 against C1. This
leads to the first refinement condition, which is that for C1 to
resolve against C2, C1 must have false instances in M.
Thus C1 may be either "F" or "T/F" in its evaluation according to
M. Furthermore, after choosing the two literals on which to unify
in the resolution of C1 against C2, C1 must still have false
instances after the most general unifier (mgu) is applied to C1.
Note that the mgu is also applied to the FSL literals. The purpose
of this is simply to make sure that a false clause in a resolution
actually contributes at least one of its false instances to the
resolvent produced.

It is not the case, however, that the search would be complete
if false clauses were allowed to contribute none of their true
instances to the search space. Thus if C1 and C2 are both "T/F" in
M, then we would try to resolve C1 against C2, producing the
resolvent R, making sure C1 can pass on only false literal
instances to R. In this case C2 acts as a true clause since, at the
very least, the literal of C2 resolved on is representing true
ground instances. We then try the other possibility, namely
resolving C2 against C1, producing R', in which we make sure the
literals in R' received from C2 have false ground instances, and
here C1 acts as the true clause.

Thus our first refinement condition states that if C1 and C2
are both "T" we do not resolve them together at all. If one has
true instances and the other has false instances (but not both are
"T/F"), then we try to resolve the false one against the true one.
If both C1 and C2 are "T/F", then we resolve C1 against C2 and also
C2 against C1. In each case the resolution of C against D is
blocked if, after unification, C has no false instances left. This
first refinement is clearly just a strengthening of TMS, and is
sensitive to the structure of the model which is used for the truth
evaluations of the clauses.
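
The case analysis of the first refinement condition can be summarized as a small table lookup on the "T", "F" and "T/F" marks. The sketch below is one reading of the text, with illustrative function names: it allows "resolve Ci against Cj" whenever Ci has false instances and Cj has true instances. The text does not discuss two "F" clauses; treating that pair as unresolvable is an assumption that follows from requiring the second clause to act as a true clause. The further check that the false parent keeps a false instance after unification needs the model and is not modelled here.

```python
def has_false(mark):
    """True if a clause marked "F" or "T/F" has false ground instances."""
    return mark in ("F", "T/F")

def has_true(mark):
    """True if a clause marked "T" or "T/F" has true ground instances."""
    return mark in ("T", "T/F")

def allowed_directions(mark1, mark2):
    """Return the resolution attempts permitted between C1 and C2.

    Each entry (i, j) means "resolve Ci against Cj": Ci is used for its
    false instances and Cj acts as the true clause.
    """
    directions = []
    if has_false(mark1) and has_true(mark2):
        directions.append((1, 2))
    if has_false(mark2) and has_true(mark1):
        directions.append((2, 1))
    return directions
```

For example, two "T" clauses yield no attempts, an "F" clause and a "T" clause yield one, and two "T/F" clauses yield both directions.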

Now  we  consider  the  second  refinement,  which  is  syntactic  in 
nature  and  is  an  elaboration  of  the  Lock  Resolution  strategy. 

When we resolve C1 against C2, we know that C1 must stand only
for those ground instances in which every one of its literals is
false. What we then do is say that the literal in C1 on which we
are allowed to resolve is constrained to be a literal of lowest
false lock number in C1. We also see that after unification, the
corresponding literal on which C2 is being resolved must be true
for all the ground instances that C2 represents in this resolution
step. From this we can require that every literal in C2 with a
smaller true lock number than the literal in C2 on which we
resolve, must represent ground instances which are false. Thus we
see rather stringent conditions being statable about what must hold
before a resolvent can be produced. The single most important
feature of HL-Resolution search organization is the movement of
these conditions in a simple form into the FSL portion of the
resolvent being produced. This is how we organize the individual
resolution steps so that we can keep track of what is going on with
respect to the allowed substitution instances for the literals in
the resolvent produced.

As an example of how this is done in a typical resolution
step, consider the two HL-clauses, C1 and C2, where C1 has at least
some false instances, and C2 has at least some true instances, and
we wish to resolve C1 against C2. We write the HL-clauses as a set
of standard literals, terminated by a semicolon, and followed by
the set of FSL literals. Each standard literal is written as an
ordered triple, in which the first component is the true lock
number, the third is the false lock number, and the second
component is a normal literal. Thus we have

C1 = <1,A(x),50>, <9,B(f(y)),61>, <2,G(c),57>;  FSL = (L1,L2)

C2 = <10,@A(u),46>, <15,E(u,f(u)),41>, <17,@A(g(v)),38>;
FSL = (L3,L4)

where L1, L2, L3 and L4 represent some literals that have been put
into the FSL's due to previous resolution steps in the derivation
trees for C1 and C2. We assume that these literals are not the
same as any of the standard literals of C1 and C2 for the purposes
of this example.


Now to resolve C1 against C2, C1 must stand for ground
instances in which every one of its literals is false, and it will
be allowed to resolve only on its literal of lowest false lock
number, which in this case is its first literal. We first try to
resolve against C2 on its literal of lowest true lock number, again
in this case its first literal. There is a mgu for these two
literals, nu = (x/u), and we form the resolvent R1:

R1 = <15,E(x,f(x)),41>, <17,@A(g(v)),38>, <9,B(f(y)),61>, <2,G(c),57>;
FSL = (L1(nu),L2(nu),L3(nu),L4(nu),A(x),B(f(y)),G(c))

Notice that the FSL set consists of everything we know must be
false, namely those things already in the FSL's of C1 and C2, plus
all of the literals from the false parent, C1.

There is another resolvent that can be produced from C1
against C2, by using the third literal in C2. Thus we produce R2,
using the mgu mu = (g(v)/x):

R2 = <10,@A(u),46>, <15,E(u,f(u)),41>, <9,B(f(y)),61>, <2,G(c),57>;
FSL = (L1(mu),L2(mu),L3(mu),L4(mu),
A(g(v)),B(f(y)),G(c),@A(u),E(u,f(u)))


Here  we  notice  that  since  the  first  and  second  literals  of  the 
true  clause,  C2,  had  lower  true  lock  numbers  than  the  true  literal 
selected  for  resolution  from  C2,  namely  the  third  literal  of  C2, 
those  first  two  literals  of  C2  must  represent  false  ground 
instances.  This  is  why  they  appear  in  the  FSL  of  R2. 
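
The bookkeeping that produces R1 and R2 can be sketched in a few lines. This is an illustration under stated assumptions, not the report's implementation: terms are encoded as strings or tuples, the mgu is supplied by the caller rather than computed, the placeholders L1 to L4 are encoded as argument-free literals, and the names `subst_term`, `subst_lit`, `hl_resolvent` and `mk` are hypothetical.

```python
def subst_term(term, s):
    """Apply substitution s (variable name -> term) to a term.

    A term is a string (variable or constant) or a tuple (functor, arg, ...).
    """
    if isinstance(term, str):
        return s.get(term, term)
    functor, *args = term
    return (functor, *[subst_term(a, s) for a in args])

def subst_lit(lit, s):
    """Apply s to a literal (negated, predicate, args)."""
    negated, pred, args = lit
    return (negated, pred, tuple(subst_term(a, s) for a in args))

def hl_resolvent(false_parent, true_parent, true_index, s):
    """Resolve false_parent against true_parent under the given mgu s.

    Each parent is (standard, fsl); standard is a list of
    (true_lock, literal, false_lock) triples and fsl is a list of literals.
    """
    f_std, f_fsl = false_parent
    t_std, t_fsl = true_parent
    f_pick = min(f_std, key=lambda triple: triple[2])   # lowest false lock number
    t_pick = t_std[true_index]
    standard = [(tl, subst_lit(lit, s), fl)
                for tl, lit, fl in t_std + f_std
                if (tl, lit, fl) != f_pick and (tl, lit, fl) != t_pick]
    fsl = [subst_lit(lit, s) for lit in f_fsl + t_fsl]        # inherited FSL's
    fsl += [subst_lit(lit, s) for _, lit, _ in f_std]         # all of the false parent
    fsl += [subst_lit(lit, s) for tl, lit, _ in t_std
            if tl < t_pick[0]]                                # lower true lock numbers
    return standard, fsl

def mk(pred, *args, negated=False):
    """Build a literal; negated=True corresponds to a leading @."""
    return (negated, pred, args)

C1 = ([(1, mk("A", "x"), 50),
       (9, mk("B", ("f", "y")), 61),
       (2, mk("G", "c"), 57)],
      [mk("L1"), mk("L2")])
C2 = ([(10, mk("A", "u", negated=True), 46),
       (15, mk("E", "u", ("f", "u")), 41),
       (17, mk("A", ("g", "v"), negated=True), 38)],
      [mk("L3"), mk("L4")])

R1 = hl_resolvent(C1, C2, 0, {"u": "x"})          # mgu nu = (x/u)
R2 = hl_resolvent(C1, C2, 2, {"x": ("g", "v")})   # mgu mu = (g(v)/x)
```

With these inputs the FSL of R1 gains only the three literals of C1, while the FSL of R2 additionally gains @A(u) and E(u,f(u)), whose true lock numbers (10 and 15) are lower than that of the literal resolved on (17).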


Thus  we  see  a  strong  interaction  between  the  Lock  Resolution 
literal  ordering  and  the  allowed  substitution  instances  that  a 
resolvent  can  have.  This  interaction  is  mediated  through  the 
action  of  the  model.  This  brings  us  to  the  last  refinement 
condition  for  HL-Resolution  steps,  namely  that  the  set  of  ground 
instances  that  a  clause  stands  for  must  be  non-empty.  In  the  case 
of R1 and R2 above, we would actually have to check within the
model, M, to see if the FSL's had any substitution instances that
simultaneously make all of the FSL literals false ground literals.
This is simply done by considering the FSL literals as if they were
the literals of an ordinary clause (i.e. "OR-ed" together), and
submitting them to the model for evaluation just like a normal
clause would be evaluated. If the model returns the answer
"false", then the FSL set has a falsifying substitution, and the
corresponding clause exists. If the answer is "true", then the
clause has no allowed instances and can be deleted.

We  make  the  following  comments  in  general  on  the  HL-Resolution 
search,  as  it  has  been  presented  up  to  this  point: 

1.  When  the  parent  clauses  have  non-empty  FSL's  then  the  mgu 
is  applied  to  both  FSL's  and  all  their  literals  go  into  the 
FSL  of  the  resolvent,  along  with  any  other  new  literals  from: 

a)  all  the  standard  literals  of  the  false  parent, 
including  the  literal  resolved  upon; 

b) any literals among the standard literals of the true
parent which have true lock numbers lower than the true
lock number of the literal resolved upon in the true
parent.


2.  There  is  usually  only  one  possible  literal  for  resolution 
in  the  false  parent. 

3.  Often  there  will  be  several  literals  to  resolve  upon  in 
the  true  parent. 

4.  The  input  clauses  start  out  with  no  literals  in  their  FSL 
lists. 

5.  As  clauses  deeper  in  the  search  are  formed,  the  FSL's  tend 
to  grow  and  become  more  restrictive. 

6.  Some  resolvents  are  blocked  due  to  having  empty  sets  of 
ground  instances. 

7.  Literals  which  contain  only  ground  terms  can  be  eliminated 
from  an  FSL  if  they  are  false;  if  they  are  true  then  the 
whole  clause  can  be  eliminated. 

8.  If  a  literal  is  no  longer  influential  in  a  clause,  then  it 
may  be  deleted  from  the  FSL  if  it  has  false  instances;  if  it 
has  only  true  instances,  then  the  clause  is  deleted. 

Factoring  in  HLR  is  somewhat  different  than  in  unrestricted 
resolution.  For  simplicity  in  explanation  we  will  treat  factoring 
as  explicit  factoring.  We  approach  the  factoring  issue  by  first 
stating  that  any  factoring  can  be  thought  of  as  a  sequence  of 
elementary  factoring  steps,  where  an  elementary  factoring  step  is 
defined  to  be  a  factoring  of  a  clause  where  the  set  of  literals  on 
which  we  factor  contains  just  two  literals.  We  also  note  that  in 
HLR,  as  in  other  binary  resolution  refinement  strategies,  it  is 
only  necessary  to  produce  factors  which  unify  literals  with  the 
literal  which  will  be  next  resolved  away  in  a  resolution  step. 
Thus in HLR we identify the next literal to be resolved on, and
perform elementary factoring steps, yielding a set of factors.
These factors, and any further factors produced from them by
elementary  factoring  steps  on  the  same  literal,  constitute  the  set 
of  factors  of  a  clause  that  must  be  produced  in  HLR.  We  do  not 
here  give  a  full  description  of  the  factoring  process,  but  offer 
the  following  example  to  illustrate  the  nature  of  the 
considerations  that  go  into  the  factoring  step.  Suppose  we  wish  to 
factor  the  clause  C: 

C = <1,A(x),50>, <2,A(f(y)),49>, <11,B(a),41>;  FSL = ( )

which  can  be  syntactically  factored  on  the  second  components  of  its 
first  and  second  literals  by  using  the  mgu  f(y)/x.  As  in  LR,  we 
delete  the  literal  with  the  higher  lock  number,  i.e.  we  "factor 
low".  The  problem  is  that  the  first  two  literals  of  C  have  no 
unique  low  number,  since  we  do  not  know  whether  to  use  the  true  or 
false  lock  numbers  of  these  literals.  Therefore  we  tentatively 
produce  the  following  two  elementary  factors  of  C: 

F1 = <1,A(f(y)),50>, <11,B(a),41>;  FSL = (@A(f(y)))

F2 = <2,A(f(y)),49>, <11,B(a),41>;  FSL = (A(f(y)))

We see that F1 stands for ground instances in which A(f(y)) is
true, and F2 for ground instances in which A(f(y)) is false. Now,
having done this, we can see that F2 is not a legitimate elementary
factor. The reason is that since A(f(y)) is false in F2, the first
literal in F2 must use its false lock number. But then the second
literal of F2 will be the one resolved away next. This means we
have factored C on a literal other than the one to be next resolved
away, and this is unnecessary in HLR. Thus the only factor of C is
F1. Notice that it is necessary to check the FSL of F1 according
to the model to see that it is indeed possible to have @A(f(y))
evaluate false. If the model says that @A(f(y)) is true, then F1
will be deleted also.

Our  explanation  of  HL-Resolution  thus  far  has  been  informal  in 
nature,  but  has  covered  enough  of  the  relevant  features  of  the 
strategy  so  that  we  are  now  in  a  position  to  see  and  understand  an 
actual example of an HL-search space. This is done in the next
section.


3.2 An HL-Resolution Example

This section works through an example theorem taken from Chang
and Lee (1973, p. 302), which is stated as:

"If  S  is  a  nonempty  subset  of  a  group  such  that  if  x,  y  belong 
to  S  then  x  *  INVERSE(y)  belongs  to  S,  then  S  contains  INVERSE(x) 
whenever  it  contains  x." 

In  order  to  have  a  search  space  of  manageable  size,  for  a  hand 
worked  example,  we  leave  out  the  associativity  axioms  for  the 
group.  The  true  and  false  lock  numbers  are  shown  by  writing 
literals  as  ordered  triples,  with  the  first  component  being  the 
true  lock  number.  We  also  write  after  each  clause,  "T" ,  "F"  or 
"T/F"  to  indicate  the  following. 

1.  "T"  if  the  clause  has  only  true  instances. 

2.  "F"  if  the  clause  has  only  false  instances. 

3.  "T/F"  if  the  clause  has  both  true  and  false  instances. 

The  model  scheme,  M,  used  for  these  truth  evaluations  for  this 
example  will  be  that  of  simultaneous  linear  equations  (SLE), 
including  inequalities.  The  following  correspondence  holds  between 
the language of the clause set, LC, and the language of the model,
LSLE:

P(x,y,z)  =>  x+y-z=0
I(x)      =>  -x
1         =>  0
B         =>  B
S(x)      =>  x >= B

where  the  constant  B  in  LC  is  a  Skolem  function  of  no  arguments 
introduced  by  negating  the  theorem  and  transforming  to  CNF,  and  the 
unary  function  I(x)  is  an  abbreviation  for  INVERSE(x). 

The  domain  of  individuals  for  M  will  be  taken  as  real  numbers, 
and the symbols in LSLE have their usual meaning in real analysis.
There  is  one  piece  of  ground  case  information  in  M,  which  is  that 
B  >  0.  We  will  not  discuss  how  this  model  was  obtained  here. 

The  clauses  for  this  theorem  are: 


1.  <10,P(1,x,x),1000>;      FSL = ( )   T

2.  <9,P(x,1,x),900>;        FSL = ( )   T

3.  <8,P(x,I(x),1),800>;     FSL = ( )   T

4.  <7,P(I(x),x,1),700>;     FSL = ( )   T

5.  <6,S(B),600>;            FSL = ( )   T

6.  <5,@S(I(B)),500>;        FSL = ( )   T

7.  <4,@S(x),100>, <3,@S(y),200>, <1,@P(x,I(y),z),300>, <2,S(z),400>;
                             FSL = ( )   T/F

We  will  perform  a  breadth  first  search  using  HL-Resolution 
with  the  SLE  model  given  above,  and  the  indicated  lock  numbering. 
No  factoring  will  be  performed  in  order  to  keep  this  example  as 
simple  as  possible.  The  notation  for  following  the  search  is  that 

n  Tn  x  m  Tm  =  r 

means  that  clause  number  n,  used  for  its  Tn  (i.e.  true  or  false) 
instances  resolves  with  clause  number  m  using  its  Tm  instances, 
resulting  in  the  resolvent  numbered  r  . 


3.2 


Page  40 


There  is  only  one  clause  that  can  be  produced  at  level  1. 
This  is  because  there  exists  only  one  clause  with  false  instances 
(clause  7),  and  we  must  resolve  it  on  its  lowest  false  lock 
numbered  literal,  and  there  is  only  one  other  literal  in  the  clause 
set  at  this  time  that  can  possibly  unify  with  it  (the  literal  in 
clause  5).  Thus  we  produce  the  only  clause  at  level  1,  namely 

5Tx7F = 8.  <3,@S(y),200>, <1,@P(B,I(y),z),300>, <2,S(z),400>;

FSL = (@S(y),@P(B,I(y),z),S(z))  F

Notice  that  @S(B)  is  not  included  in  the  FSL  of  clause  8  since 
it  is  a  ground  literal  and  is  false  in  M.  Also  notice  that  clause 
8  has  only  false  instances  in  M,  as  will  always  be  the  case  in 
HL-Resolution  when  one  of  the  parents  is  a  true  unit  clause. 

At  level  2  of  the  search  there  are  two  resolvents  produced. 

5Tx8F = 9.  <1,@P(B,I(B),z),300>, <2,S(z),400>;

FSL = (@P(B,I(B),z),S(z))  F

7Tx8F = 10.  <4,@S(x),100>, <3,@S(y'),200>, <1,@P(x,I(y'),y),300>,
<1,@P(B,I(y),z),300>, <2,S(z),400>;

FSL = (@S(y),@P(B,I(y),z),S(z),@P(x,I(y'),y))  T/F

Now we start producing the clauses at level 3.

3Tx9F = 11.  <2,S(1),400>;  FSL = ( )  F

We notice that 5Tx10F produces no resolvents since the only
allowed unification according to the lock numbering, namely letting
x in clause 10 unify to B, results in a resolvent whose FSL set
becomes

FSL = (@S(B),@S(y'),@P(B,I(y'),y),@P(B,I(y),z),S(z),@S(y))

and the model M cannot make all of these literals false at the same
time, i.e. the resolvent would not represent any ground instances.

It turns out that we have completed level 3 already, since no
more  pairs  of  clauses,  both  at  level  2  or  below,  can  resolve.  Thus 
level  3  consists  of  clause  11.  We  now  start  producing  level  4. 

7Tx11F = 12.  <4,@S(x),100>, <1,@P(x,I(1),z),300>, <2,S(z),400>;

FSL = (@P(x,I(1),z),S(z))  T

7Tx11F = 13.  <3,@S(y),200>, <1,@P(1,I(y),z),300>, <2,S(z),400>;

FSL = (@P(1,I(y),z),S(z),@S(y))  F

10Tx11F = 14.  <4,@S(x),100>, <1,@P(x,I(1),y),300>,
<1,@P(B,I(y),z),300>, <2,S(z),400>;

FSL = (@S(y),@P(B,I(y),z),S(z),@P(x,I(1),y))  F

10Tx11F = 15.  <3,@S(y'),200>, <1,@P(1,I(y'),y),300>,
<1,@P(B,I(y),z),300>, <2,S(z),400>;

FSL = (@S(y),@P(B,I(y),z),S(z),@P(1,I(y'),y))  T/F

This  concludes  level  4  of  the  search. 

Notice  that  both  clauses  14  and  15  contain  the  literal  @S(y) 
in  their  FSL  sets,  and  this  literal  came  from  the  FSL  set  of  clause 
10.  This  literal  is  one  of  the  reasons  why  clause  14  only  has 
false instances. Without that literal clause 14 would be classed
"T/F". It is worthwhile here to interrupt the search briefly to
show what the model evaluation is like for clause 14, for our
chosen  model  M. 

We  write  the  FSL  and  standard  literals,  each  negated,  in  the 
language  of  the  model,  for  clause  14.  This  gives  8  conditions,  one 
for  each  literal. 

1.  x > B

2.  x-0 = y

3.  B-y = z

4.  z < B

5.  y > B

6.  B-y = z

7.  z < B

8.  x-0 = y

This  is  a  consistent  set  of  conditions,  so  M  returns  false  as 
its evaluation. Showing that there are no true instances
requires more work, however. We see that the 2nd, 3rd and 4th
standard  literals  of  clause  14  can  never  be  true,  since  they  are 
represented  in  the  FSL  set.  But  we  are  not  sure  about  the  first 
literal.  To  make  a  decision  we  translate  the  negations  of  the  FSL 
literals  and  the  first  literal  (directly)  into  the  language  of  the 
model.  This  gives 

1.  x < B          from  @S(x)

2.  y > B          from  @S(y)

3.  B-y = z        from  @P(B,I(y),z)

4.  z < B          from  S(z)

5.  x-0 = y        from  @P(x,I(1),y)

where  the  first  condition  is  the  direct  translation  of  the  first 
literal  in  clause  14,  and  the  rest  are  translations  of  the 
negations  of  the  clause  14  FSL  literals,  as  shown  on  the  right.  We 
see  that  the  5th  condition  says  x=y,  and  this  causes  the  first  two 
conditions  to  be  contradictory.  From  this  we  conclude  that  the  set 
of  substitutions  that  make  all  of  the  FSL  literals  simultaneously 
false  are  disjoint  from  the  set  of  substitutions  that  make  @S(x) 
true.  Therefore  the  set  of  ground  instances  which  clause  14  stands 
for  all  have  their  first  literal  (i.e.  @S(x),  grounded)  false.  We 
already  knew  that  the  other  literals  must  be  false  since  they  were 
exactly  represented  in  the  FSL  of  clause  14.  Thus  clause  14 
represents  only  false  ground  instances. 
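
The two condition sets for clause 14 can be checked mechanically. The following is a minimal sketch only: it assumes the model's domain can be sampled as a small range of integers and it treats B as one more integer to be chosen, neither of which is stated in this section. The name `satisfiable` is a hypothetical helper, and a bounded search of this kind is an illustration, not a decision procedure.

```python
from itertools import product

def satisfiable(conditions, values=range(-3, 4)):
    """True if some integer assignment to (x, y, z, B) meets every condition."""
    return any(
        all(cond(x, y, z, B) for cond in conditions)
        for x, y, z, B in product(values, repeat=4)
    )

# Every literal of clause 14 negated (the eight conditions, duplicates dropped).
all_false = [
    lambda x, y, z, B: x > B,       # negation of @S(x)
    lambda x, y, z, B: x - 0 == y,  # negation of @P(x,I(1),y)
    lambda x, y, z, B: B - y == z,  # negation of @P(B,I(y),z)
    lambda x, y, z, B: z < B,       # negation of S(z)
    lambda x, y, z, B: y > B,       # negation of @S(y)
]

# First literal taken directly, FSL literals negated (the five conditions).
first_literal_true = [
    lambda x, y, z, B: x < B,       # @S(x), direct translation
    lambda x, y, z, B: y > B,       # negation of @S(y)
    lambda x, y, z, B: B - y == z,  # negation of @P(B,I(y),z)
    lambda x, y, z, B: z < B,       # negation of S(z)
    lambda x, y, z, B: x - 0 == y,  # negation of @P(x,I(1),y)
]

print(satisfiable(all_false))           # True: a false instance exists
print(satisfiable(first_literal_true))  # False: x = y clashes with x < B < y
```

The second result does not depend on the sampled range, since x = y together with x < B and y > B is contradictory for any choice of values.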

The  above  explanation  of  how  the  SLE  model  works  is  on  an 
intuitive  level.  For  automatic  theorem  proving  this  model  must  be 
implemented  as  a  decision  procedure  for  the  class  of  expressions  we 
are  interested  in.  As  an  example  of  such  a  decision  procedure  see 
Cooper  (Cooper,  1972).  A  somewhat  different  procedure  for  the  SLE 
model  is  currently  being  implemented  for  experimental  purposes,  but 
it  is  not  yet  certain  if  it  is  actually  a  decision  procedure. 

We  now  pick  up  the  search  process  again  at  the  point  we  left 
off,  starting  level  5. 

11Fx12T = 16.  <1,@P(1,I(1),z),300>, <2,S(z),400>;

FSL = (@P(1,I(1),z),S(z))  F

5Tx13F = 17.  <1,@P(1,I(B),z),300>, <2,S(z),400>;

FSL = (@P(1,I(B),z),S(z))  F

7Tx13F = 18.


In  order  to  keep  this  example  as  clean  as  possible  we  will  not 
write  out  the  forms  of  the  clauses,  unless  they  are  involved  in  the 
proof,  or  are  of  other  interest,  from  this  point  onward. 


5Tx14F = 19.
7Tx14F = 20.
11Fx15T = 21.
5Tx15F = 22.
7Tx15F = 23.


This  concludes  level  5,  and  we  start  level  6. 


1Tx16F = 24.

At this point, clauses 3T and 18F resolve to give

<2,S(1),400>;  FSL = ( )  F

which  is  a  duplicate  of  clause  10.  In  general  it  could  be 
difficult  to  make  decisions  about  subsumption  possibilities  or  to 
determine  if  two  clauses  which  differ  only  in  their  FSL  sets  are 
equivalent,  without  incurring  a  high  computational  load.  We  would 
expect  however,  that  a  reasonable  implementation  of  HL-Resolution 
would  detect  the  identity  of  this  3Tx18F  resolvent  and  clause  11. 
Thus  we  do  the  same  in  this  example  here,  and  delete  this  clause. 
The  next  clause  to  be  produced  is 


1Tx17F = 25.  <2,S(I(B)),400>;  FSL = ( )  F

Clause  25  is  a  unit,  and  we  assume  that  there  is  a  one  step 
unit  look-ahead  in  the  search  process.  Thus,  since  clause  25 
resolves  with  clause  6 

6Tx25F = 26.  *BOX*;  FSL = ( )  F

we  have  found  a  proof.  The  FSL  is  an  important  part  of  the  HL-null 
clause.  If  a  null  clause  is  formed  during  an  HL-Resolution  search 
but  the  FSL  requirement  has  no  ground  instances  for  which  the 
clause  stands,  then  a  proof  has  not  been  found  of  the  required  form 
for  HL-Resolution. 

It  should  be  emphasized  that  the  26  clauses  (7  input  and  19 
generated)  constitute  the  complete  breadth  first  search  space 
(without  factoring)  of  HL-Resolution  for  this  problem.  The  null 
clause  appeared  at  level  7,  and  this  was  detected  when  the  second 
clause  at  level  6  was  generated  (i.e.  clause  25). 

This  same  clause  set  was  also  submitted  to  a  breadth  first 
normal  resolution  theorem  prover,  and  a  proof  was  obtained  after 
reaching  a  search  space  size  of  165  clauses.  The  theorem  prover 
used  the  one  step  look  ahead  for  generated  units,  and  a  special 
type  of  factoring  which  produces  fewer  factored  clauses  than  the 
usual  factoring.  The  null  clause  was  obtained  at  level  4,  and  this 
was  seen  after  generating  one  clause  at  level  3,  by  the  look  ahead 
for  units.  The  total  number  of  clauses  at  the  end  of  level  2  was 
163 clauses. Thus the search was growing rapidly in the number of
clauses per level already at level 2. An estimate of the number of
clauses  contained  in  level  3  is  1300  clauses. 

The  search  was  also  tried  by  the  same  theorem  prover  but 
without  any  factoring  at  all.  This  time  the  proof  was  obtained 
after  generating  224  clauses.  Again,  a  unit  look  ahead  found  the 
proof  at  level  4  before  level  3  was  finished.  This  time  148 
clauses  were  produced  at  level  3  before  the  look  ahead  discovered 
the  proof. 

It  is  seen  that  the  HL-Resolution  search  generates  far  fewer 
clauses  at  any  given  level.  However  the  null  clause  was  3  levels 
deeper  in  HL-Resolution  than  in  unrestricted  resolution.  To  make  a 
more  valid  comparison  between  HL-Resolution  and  unrestricted 
breadth  first,  with  factoring  (BF-FAC)  and  without  (BF),  we  compare 
the  number  of  clauses  generated  at  the  time  a  proof  was  detected, 
and  the  number  of  clauses  contained  in  the  search  space  at  the 
level  two  levels  before  the  proof  level. 

type of       number of clauses   level of   number of clauses
resolution    at proof time       proof      2 levels before
                                             the proof level

HLR           26                  7          8 (level 5)

BF-FAC        165                 4          147 (level 2)

BF            224                 4          63 (level 2)

We  see  that  HL-Resolution  decreased  the  number  of  clauses  by 
about  a  factor  of  eight,  for  this  particular  problem. 
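
As a quick arithmetic check on the comparison above, the ratios of clauses generated at proof time can be computed directly from the table. This is only a sketch: the dictionary and the function name `reduction_factor` are illustrative, and the text does not say which of the two unrestricted searches the figure of "about eight" refers to.

```python
# Clauses generated at the time the proof was detected, taken from the table.
clauses_at_proof = {"HLR": 26, "BF-FAC": 165, "BF": 224}

def reduction_factor(strategy, baseline="HLR"):
    """Ratio of clauses generated by `strategy` to those generated by `baseline`."""
    return clauses_at_proof[strategy] / clauses_at_proof[baseline]

for name in ("BF-FAC", "BF"):
    print(f"{name}: {reduction_factor(name):.1f}")
# BF-FAC: 6.3
# BF: 8.6
```

On these numbers the stated factor is closest to the comparison against the search without factoring (BF).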




3.3  Some Comments Concerning HL-Resolution Searches
and Other Resolution Searches

Resolution  developed  rapidly  as  a  theorem  proving  method 
through  the  development  of  a  large  number  of  strategies.  Most  of 
these  strategies  were  syntactically  oriented,  as  was  the  basic 
resolution  procedure.  The  outstanding  feature  of  these  strategies 
was  their  lack  of  uniformity  in  application  effect.  Any  given 
strategy  might  help  the  search  tremendously  on  one  specific 
problem,  and  help  very  little,  or  become  even  detrimental  when  used 
on  a  slightly  different  problem.  Clearly  these  strategies  were  not 
sufficiently  sensitive  to  what  was  happening  in  the  search  space. 

The  two  fundamental  problems  with  resolution  methods  in 
general  are: 

1.  the  lack  of  any  effective  techniques  addressing  themselves 
to  the  Term  Substitution  Problem  (TSP); 

2.  the  restriction  to  local  interaction  in  the  search  space 
(LIP,  for  Local  Interaction  Problem). 

The  term  substitution  problem  refers  to  the  fact  that  in  first 
order  logic  (and  therefore  in  first  order  resolution)  the 
semidecidability (as opposed to decidability) has its origin in the
inability to detect the non-existence of a grounding substitution
which makes a set of clauses an unsatisfiable set of ground
clauses.  This  means  that  the  real  procedural  search,  i.e.,  the 
part  of  the  search  that  cannot  both  be  bounded  in  the  amount  of 
work performed and still maintain completeness of the search, is
the search over the Herbrand universe.

Most  resolution  strategies  have  no  adequate  method  for 
directly  handling  the  TSP.  In  fact,  they  hide  the  problem  by 
prescribing  the  use  of  the  most  general  unifier.  This  mgu  is 
automatically  called  forth,  as  a  result  of  which  clauses,  and  which 
literals  within  these  clauses,  are  chosen  by  the  search  strategies. 
These  strategies  are  typically  insensitive  to  the  TSP,  and  when 
they  are  sensitive  to  it,  they  seem  to  be  sensitive  to  the  wrong 
aspect  of  it.  An  example  of  a  strategy  which  is  insensitive  to  the 
TSP  is  LR.  Strategies  which  prevent  the  formation  of  clauses  which 
exceed  a  given  bound  on  the  depth  of  function  nesting  are  an 
example  of  strategies  that  are  sensitive  to  the  wrong  aspect  of  the 
TSP.  A  fundamental  shortcoming  of  the  use  of  the  mgu  in  resolution 
steps is that unification is essentially a two literal process,
while the resolution step involves all of the literals in the
parent  clauses. 

Even  when  the  proper  terms  appear  in  a  resolution  search,  the 
sentential  connections  necessary  are  only  found  after,  usually, 
many  additional  clauses  with  new  irrelevant,  or  redundant,  terms 
are  generated.  Thus  resolution  tends  to  be  term  intensive  (and 
indiscriminately  so)  rather  than  truth  table  intensive. 

We  identify,  then,  two  deficiencies  in  the  way  resolution 
handles  the  TSP.  The  first  is  the  promiscuous  generation  of 
substitution  instances,  and  the  second  is  the  lack  of  address  in 
following  the  sentential  consequences  of  terms  shortly  after  their 
formation. These two problems are very closely connected, but not
inseparably so. We will return to these two aspects of the TSP
after  a  brief  discussion  of  the  local  interaction  problem. 

The  local  interaction  problem  (LIP)  is,  for  large  theorem 
proving  problems,  the  single  most  important  factor  causing 
resolution  search  spaces  to  grow  as  fast  as  they  do.  There  are  two 
recognizable  aspects  to  this  problem.  The  first  is  the  pure 
syntactic  redundancy  of  a  sentential  nature  which  occurs  whenever 
the  search  procedure  fails  to  have  the  property  called  singly 
connectedness  (Wos  et  al.,  1967).  Singly  connectedness  is  almost 
achieved  by  Lock  Resolution  and  this  aspect  of  the  LIP  can  be 
considered  reasonably  well  handled  by  resolution  strategies  that 
include  LR  within  themselves. 

The  second  aspect  of  the  LIP  is  the  lack  of  adequate 
communication  of  partial  search  results  from  one  part  of  the  search 
space  to  other  parts.  Typically,  when  a  resolvent  is  produced,  the 
only  influence  it  has  is  through  subsumption  of  or  by  other 
clauses,  or  through  direct  resolution  with  other  clauses.  Both  of 
these  modes  of  interaction,  but  particularly  resolutions,  become 
impractical in a search which is combinatorially growing. What is
needed  is  some  form  of  interaction  that  uses  information  in  a 
clause,  immediately  upon  the  generation  of  the  clause,  to  simplify 
the  rest  of  the  search  space.  This  interaction  could  be  semantic, 
syntactic,  or  both. 

HL-Resolution  does  something  about  both  the  TSP  and  the  LIP 
which  most  other  resolution  strategies  do  not.  It  avoids  producing 
new  term  substitutions  at  the  high  rate  that  other  strategies  do. 




Looking  back  at  the  hand  worked  example  of  section  3.2,  we  see  that 
HL-Resolution  produced  no  new  terms  at  all  of  higher  function 
nesting  depth  than  that  which  existed  in  the  input  set.  This  is  in 
distinction  to  the  unrestricted  breadth  first  resolution  searches 
with  which  it  was  compared,  which  by  level  2  had  already  introduced 
a  term  with  2  additional  levels  of  function  nesting.  To  whatever 
degree  HLR  retards  the  introduction  of  new  term  classes  in  the 
search  it  must  be  instead  concentrating  effort  in  the  direction  of 
exploring  the  sentential  connections  of  term  classes  already  in  the 
search  space.  This  is  true  in  a  weak  sense  only,  however.  The 
problem  here  is  that  we  really  need  to  achieve  a  much  stronger 
characterization of (and concomitant ability to control) the ground
instances  that  a  general  level  clause  represents.  At  present  HLR 
controls  the  allowed  ground  instances  according  to  a  truth 
characterization  evaluated  by  a  model.  There  is  reason  to  believe 
that  HLR  can  be  extended  (semantically  and  syntactically)  so  as  to 
exert  even  more  control  over  the  form  of  the  search,  and  it  is 
expected  that  such  extensions  will  specifically  help  with  the  TSP. 
The  whole  issue  of  proper  term  substitution  is  a  delicate  one  since 
semidecidability  of  the  first  order  logic  limits  what  can  be  done, 
but  yet  it  is  clear  that  current  methods  are  not  yet  optimal  within 
this  limit. 

The  LIP  is  handled  in  a  partial  manner  by  HL-Resolution  as 
formulated.  An  HL-clause  carries  with  itself,  implicitly  in  the 
information  in  the  FSL  set,  a  notion  of  where  it  came  from. 
Remember  that  the  model,  M,  is  used  in  a  manner  similar  to  the  way 
it would be used in TMS. However, TMS is quite sloppy with respect
to just what instances of a clause are the relevant ones in a
resolution  step.  This  gives  rise  in  TMS  to  a  search  space  which  at 
the  general  level  may  locally  make  sense,  but  globally  is 
inconsistent  with  the  central  idea  behind  TMS.  In  particular  it  is 
possible,  using  TMS  and  a  model,  M,  to  find  a  refutation  for  a  set 
of  clauses  such  that 

1.  The  general  level  proof  tree  satisfies  The  Model  Strategy 
as  formulated  in  Luckham  (Luckham,  1968),  using  M. 

2.  There  does  not  exist  any  grounding  substitution  of  the 
proof  tree  such  that  the  ground  level  tree  formed  satisfies 
The  Model  Strategy  using  M. 

In  HL-Resolution  it  is  the  FSL  set,  in  conjunction  with  the 
model  that  prevents  such  a  situation  from  occurring  (in  fact,  the 
lock  numbers  are  also  involved  in  this  issue,  but  only 
accidentally,  i.e.  TMS  could  have  this  defect  corrected  without 
the  utilization  of  lock  numbers).  Thus  HL-Resolution  introduces  a 
strong  global  constraint  on  the  nature  of  the  underlying  ground 
proof  trees  that  it  is  searching  over  at  the  general  level.  This 
is  accomplished  by  including  in  the  FSL  set  the  information  which 
is  relevant  concerning  how  that  clause  was  derived.  An  additional 
effect  of  doing  this  is  that  it  helps  reduce  the  number  of  term 
instances  that  can  be  formed. 


Unfortunately, the real issue in the LIP, that of relating
information in one part of the search rapidly and effectively to
the rest of the search, has not been handled very well at all.
There is some weak immediate information transfer in the sense that
a resolvent, when generated, may result in a change in the model,
which  then  affects  all  future  model  evaluations.  As  in  the  case  of 
the  TSP,  there  are  indications  here  also  that  HL-Resolution  can  be 
extended  so  as  to  handle  the  LIP  more  effectively. 

This  section  has  forced  a  somewhat  unnatural  conceptual 
distinction  between  the  TSP  and  the  LIP.  They  are  both  really 
reflections  of  a  single  underlying  difficulty  in  resolution  theorem 
proving,  which  is  that  there  is  strong  context  dependency  in  the 
structure  of  a  resolution  search  space  which  is  not  being 
adequately  handled.  HL-Resolution  is  an  initial  attempt,  through 
the  use  of  FSL's  which  are  evaluated  by  a  model,  to  increase  the 
context  size,  or  scope,  of  the  basis  on  which  refinement  strategy 
decisions  are  made. 




3.4  Definition  of  the  Basic  HL-Resolution  Refinement  Strategy 


This  section  defines  the  basic  HL-Resolution  refinement 
strategy  in  a  precise  fashion,  and  is  not  oriented  toward  an 
implementation  viewpoint  and  description.  Notice  that  in  this 
section  in  order  to  make  things  precise,  we  modify  the  notion  of 
HLR  clauses  and  clause  sets.  Input  clause  sets  are  transformed 
into  primitive  representation  HL-clause  sets.  Doing  this  clarifies 
both  the  formal  definition  of  the  HLR  refinement  and  the  proof  of 
its  completeness.  The  refinement  strategy  that  results  is  referred 
to  as  the  basic  HLR  strategy.  This  strategy  is  defined  in  this 
section  in  terms  of  primitive  representation  clauses.  It  is 
important  to  distinguish  between  the  phrases  "basic  HLR  strategy", 
and  "primitive  representation  clauses".  The  former  is  a  refinement 
strategy,  while  the  latter  refers  to  a  notational  scheme  in  which 
this  section  presents  the  basic  HLR  strategy.  Later  in  this  report 
(near  the  end  of  section  3.5)  we  indicate  how  the  notions  of  HLR 
stated  here  relate  to  the  description  of  HLR  given  in  sections  3.1 
and 3.2.


Examples of many of the definitions used in this and the
next section are to be found in the Appendix.

The  usual  notions  of  resolution  are  assumed  familiar  (e.g. 
treating  clauses  as  sets  (and  as  lists)  of  literals,  most  general 
unifier  (mgu),  and  the  notions  of  Herbrand  universe  and  base).  We 
extend  these  notions  in  an  obvious  way  implicitly  in  this  section 
(e.g. the mgu of two HL-literals is the mgu of their second
components). We define a Herbrand interpretation to be a set of
ground literals such that every literal is either an element of the
Herbrand  base  or  the  negation  of  an  element  of  the  Herbrand  base, 
and  every  element  of  the  Herbrand  base  appears  in  the 
interpretation  exactly  once,  either  directly  or  with  a  negation 
sign.  If  a  literal  is  an  element  of  a  given  Herbrand 
interpretation  then  it  is  said  to  be  true  in  that  interpretation. 
Otherwise  it  is  said  to  be  false  in  that  interpretation. 

HL-Literal

An  HL-literal  is  a  3-tuple  whose  first  and  third  components 
are  integers.  The  second  component  is  a  normal  literal.  We  refer 
to  the  assignment  of  integer  values  to  the  first  and  third 
components  of  HL-literals  as  a  lock  numbering.  Any  lock  numbering 
of  a  set  of  HL-literals  in  which  all  the  first  component  values  are 
smaller  than  all  of  the  third  component  values  is  called  a 
HL-proper  lock  numbering.  If  all  of  the  integers  used  in  a 
HL-proper  lock  numbering  are  distinct,  the  numbering  is  referred  to 
as  unambiguous.  We  refer  to  an  HL-literal  as  just  a  literal,  and 
specifically  use  the  word  "normal"  when  referring  to  normal 
literals.  The  first  component  of  a  literal  is  called  the  true  lock 
number,  and  the  third  component  is  called  the  false  lock  number. 
The  selector  functions  for  true  and  false  lock  numbers  are  t  and  f, 
respectively.  For  all  definitions  which  define  new  clauses  in 
terms  of  old  clauses  (e.g.  resolving  two  clauses,  or  factoring  a 
clause)  we  assume  that  the  old  clauses  have  been  replaced  by  copies 
of  themselves  in  which  new  unique  variable  names  have  been 
substituted.
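
The definitions of an HL-literal and of HL-proper and unambiguous lock numberings can be sketched as a small data structure. This is an illustration under assumptions, not the report's implementation: the class name `HLLiteral`, its field names, and the string encoding of normal literals (with "@" marking negation) are all hypothetical, and "distinct" is read here as applying to every lock number in the set of literals.

```python
from typing import NamedTuple

class HLLiteral(NamedTuple):
    true_lock: int   # first component, selector t
    normal: str      # second component, a normal literal such as "@S(x)"
    false_lock: int  # third component, selector f

def is_hl_proper(literals):
    """All true lock numbers are smaller than all false lock numbers."""
    lits = list(literals)
    if not lits:
        return True
    return max(l.true_lock for l in lits) < min(l.false_lock for l in lits)

def is_unambiguous(literals):
    """HL-proper, and every lock number used is distinct."""
    lits = list(literals)
    locks = [l.true_lock for l in lits] + [l.false_lock for l in lits]
    return is_hl_proper(lits) and len(locks) == len(set(locks))

# Standard literals of clauses 12 and 14 from the worked example of section 3.2.
clause12 = [
    HLLiteral(4, "@S(x)", 100),
    HLLiteral(1, "@P(x,I(1),z)", 300),
    HLLiteral(2, "S(z)", 400),
]
clause14 = [
    HLLiteral(4, "@S(x)", 100),
    HLLiteral(1, "@P(x,I(1),y)", 300),
    HLLiteral(1, "@P(B,I(y),z)", 300),
    HLLiteral(2, "S(z)", 400),
]

print(is_hl_proper(clause12), is_unambiguous(clause12))
print(is_hl_proper(clause14), is_unambiguous(clause14))
```

Under this reading clause 12 is both HL-proper and unambiguous, while clause 14 is HL-proper but ambiguous because two literals share the lock numbers 1 and 300.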




A  literal  is  a  ground  literal  if  no  variable  appears  within 
it.  We  define  the  truth  value  of  a  ground  literal  with  respect  to 
a  Herbrand  interpretation,  M,  to  be  the  same  as  the  truth  value  of 
its  second  component. 


HL-Clause, Standard Literals, False Substitution Literals

An  HL-clause,  C,  is  an  ordered  pair  of  sets  of  literals.  The 
first  set  is  denoted  by  SD(C)  and  elements  of  SD(C)  are  called 
standard  literals  of  C.  The  second  set  is  denoted  FSL(C)  and 
elements  of  FSL(C)  are  called  False  Substitution  Literals  of  C.  We 
write 

C = <SD(C), FSL(C)>

as  an  obvious  identity.  Both  standard  literals  and  FSL  literals 
are  HL-literals.  This  report  makes  no  specific  use  of  the  lock 
numbers  of  the  FSL  literals. 

A  grounding  substitution,  G,  for  a  literal  L,  is  any 
substitution  such  that  L(G)  is  a  ground  literal.  A  grounding 
substitution,  G,  for  a  set  of  literals,  S,  is  any  substitution  that 
qualifies  as  a  grounding  substitution  for  each  element  of  S.  A 
grounding  substitution,  G,  for  an  HL-clause,  C,  is  any  substitution 
that  qualifies  as  a  grounding  substitution  for  SD(C)  and  for 
FSL(C).  If  G  is  a  grounding  substitution  for  the  HL-clause,  C, 
then  C(G)  is  a  ground  HL-clause. 




In  what  follows  M  will  designate  an  arbitrary  but  fixed 
Herbrand  interpretation.  We  define  an  evaluation  function,  also 
denoted  M,  for  the  Herbrand  interpretation  M,  such  that  if  L  is  a 
set  of  literals 

         / T   if for all grounding substitutions, THETA, of L,
M(L) =  <      at least one literal of L(THETA) is true in M.
         \ F   otherwise

If  L  happens  to  be  the  set  of  normal  literals  of  a  normal 
clause,  then  the  above  definition  of  M  is  usually  what  is  meant  by 
a truth evaluation of a clause in resolution strategies (e.g. in
TMS).

Feasible  Clauses 

A HL-clause, C, is said to be feasible (with respect to M),
iff M(FSL(C)) = F. A HL-clause is infeasible iff it is not
feasible.
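
The evaluation function M on sets of literals and the feasibility test can be sketched for a toy setting. The sketch below makes several assumptions that are not in the report: the universe is a finite list of constants with no function symbols (a real Herbrand universe is generally infinite), a normal literal is encoded as a tuple `(positive, predicate, args)`, lowercase argument names are variables, and the interpretation is given as the set of true ground atoms. All function names are hypothetical.

```python
from itertools import product

def variables(literals):
    """Variable names (lowercase arguments, by assumption) occurring in the literals."""
    return sorted({a for _, _, args in literals for a in args if a.islower()})

def ground(literals, theta):
    """Apply the substitution theta to every literal."""
    return [(pos, p, tuple(theta.get(a, a) for a in args)) for pos, p, args in literals]

def is_true(literal, true_atoms):
    """Truth of a ground literal in the interpretation given by its true atoms."""
    pos, p, args = literal
    return ((p, args) in true_atoms) == pos

def evaluate(literals, true_atoms, universe):
    """M(L): 'T' if every grounding makes at least one literal true, else 'F'."""
    vs = variables(literals)
    for values in product(universe, repeat=len(vs)):
        theta = dict(zip(vs, values))
        if not any(is_true(l, true_atoms) for l in ground(literals, theta)):
            return "F"
    return "T"

def feasible(fsl, true_atoms, universe):
    """A clause is feasible iff M(FSL(C)) = F."""
    return evaluate(fsl, true_atoms, universe) == "F"

universe = ["A", "B"]
true_atoms = {("S", ("A",))}           # S(A) is true, S(B) is false
s_x = (True, "S", ("x",))
not_s_x = (False, "S", ("x",))

print(feasible([], true_atoms, universe))              # empty FSL set
print(feasible([s_x], true_atoms, universe))           # x = B makes S(x) false
print(feasible([s_x, not_s_x], true_atoms, universe))  # never all false
```

In this sketch an empty FSL set is always feasible, and an FSL set containing both a literal and its negation is never feasible.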

HL-Null Clause

An HL-null clause is any feasible HL-clause, C, such that
SD(C) is empty.




Exact  Primitive  Representation 


For  a  given  HL-clause,  C,  we  say  that  the  HL-clause  k  is  an 
exact  primitive  representation  (with  respect  to  M)  of  C  iff: 


1.  SD(k) = SD(C)

2.  .FA. l:  l e FSL(C) --> l e FSL(k)

3.  .FA. l:  l e SD(k) --> [ l e FSL(k) OR @l e FSL(k) ]

4.  .FA. l:  l e FSL(k) --> [ @l e SD(C) OR ( l e SD(C) U FSL(C) ) ]

5.  k is feasible


An  exact  primitive  representation  is  an  HL-clause.  An  HL-clause, 
k,  that  fails  to  satisfy  condition  4,  but  satisfies  the  other 
conditions  is  called  an  inexact  primitive  representation  of  C. 
When  it  is  immaterial  as  to  whether  a  primitive  representation  is 
exact  or  inexact,  it  will  be  called  just  a  primitive 
representation.  Notice  that  if  k  is  a  primitive  representation  of 
C,  then  k  is  an  exact  primitive  representation  of  k. 


Exact  Primitive  Representation  Set 

The exact primitive representation set (with respect to M) of C
is defined to be the set of all exact primitive representations
of C.



M-Evaluations  of  HL-Clauses 


We  now  extend  the  domain  of  the  model  evaluation  function,  M, 
previously  defined  on  sets  of  literals,  to  include  also  feasible 
HL-clauses,  as  follows. 

Let  C  be  a  feasible  HL-clause,  and  let  G(x,y)  be  the  relation 
which  is  true  when  x  is  a  grounding  substitution  for  the 
HL-clause  y.  Then 


          / T     if .FA. THETA: [G(THETA,C)
          |          and M(FSL(C)(THETA)) = F]
          |          --> [M(SD(C)(THETA)) = T]
          |
M(C) =   <  F     if .FA. THETA: [G(THETA,C)
          |          and M(FSL(C)(THETA)) = F]
          |          --> [M(SD(C)(THETA)) = F]
          |
          \ T/F   otherwise


The  above  definition  is  for  HL-clauses  in  general.  Notice  that  if 
the  clause  C  is  a  ground  clause  (i.e.  SD(C)  and  FSL(C)  are  ground 
sets  of  literals),  or  if  C  is  a  primitive  representation  clause, 
then M(C) will never be "T/F".
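
The three-valued evaluation of a feasible HL-clause can be sketched in the same toy setting: a finite, function-free universe, normal literals encoded as `(positive, predicate, args)` tuples with lowercase arguments as variables, and the interpretation given by its true ground atoms. These encodings and the name `evaluate_clause` are assumptions made for illustration and do not come from the report.

```python
from itertools import product

def is_true(literal, true_atoms):
    """Truth of a ground literal in the interpretation given by its true atoms."""
    pos, pred, args = literal
    return ((pred, args) in true_atoms) == pos

def apply(literals, theta):
    """Apply the substitution theta to every literal."""
    return [(pos, p, tuple(theta.get(a, a) for a in args)) for pos, p, args in literals]

def evaluate_clause(sd, fsl, true_atoms, universe):
    """M(C): 'T', 'F' or 'T/F' over the groundings that make every FSL literal false."""
    vs = sorted({a for _, _, args in sd + fsl for a in args if a.islower()})
    seen = set()
    for values in product(universe, repeat=len(vs)):
        theta = dict(zip(vs, values))
        if any(is_true(l, true_atoms) for l in apply(fsl, theta)):
            continue  # instance excluded: some FSL literal is true
        sd_true = any(is_true(l, true_atoms) for l in apply(sd, theta))
        seen.add("T" if sd_true else "F")
    if not seen:
        raise ValueError("clause is infeasible: it stands for no ground instances")
    return seen.pop() if len(seen) == 1 else "T/F"

universe = ["A", "B"]
true_atoms = {("S", ("A",))}   # S(A) is true, S(B) is false
s_x = (True, "S", ("x",))
not_s_x = (False, "S", ("x",))

print(evaluate_clause([s_x], [], true_atoms, universe))         # T/F
print(evaluate_clause([s_x], [s_x], true_atoms, universe))      # F
print(evaluate_clause([s_x], [not_s_x], true_atoms, universe))  # T
```

The last two calls mirror the remark about primitive representation clauses: once each standard literal or its negation appears in the FSL set, the clause can no longer be classed "T/F".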


Selected  True  Literal 

Let k be a primitive representation of C, then for each
literal l such that

1.  l e SD(k)

2.  @l e FSL(k)

3.  .FA. m:  [ m e SD(k) AND @m e FSL(k) ] --> t(l) < t(m)

we say that l is a selected true literal of k.

Selected  False  Literal 

If k is a primitive representation of C, and

.FA. m:  m e SD(k) --> m e FSL(k)

then for each literal l such that

1.  l e SD(k)

2.  .FA. m:  m e SD(k) --> f(l) < f(m)

we say that l is a selected false literal of k.

Selected  Literal 

Let  k  be  a  primitive  representation  HL-clause.  If  M(k)  =  T 
then  a  selected  literal  of  k  is  any  selected  true  literal  of  k.  If 
M(k)  =  F  then  a  selected  literal  of  k  is  any  selected  false  literal 
of k.
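
The selection rules can be sketched as follows. Several assumptions are made here and are not taken from the report: literals are compared by their second components only (the lock numbers of FSL literals are ignored), negation is encoded as an "@" prefix on a string, the lock-number comparison is read as "smallest among the candidates" so that ties yield several selected literals, and the primitive representation form given for clause 12 is constructed for illustration. The names `HLLiteral`, `negate` and `selected_literals` are hypothetical.

```python
from typing import NamedTuple

class HLLiteral(NamedTuple):
    true_lock: int   # selector t
    normal: str      # normal literal; "@" prefix marks negation
    false_lock: int  # selector f

def negate(normal):
    """Negation of a normal literal in the '@' string encoding."""
    return normal[1:] if normal.startswith("@") else "@" + normal

def selected_literals(sd, fsl):
    """Selected literals of a primitive representation clause <sd, fsl>."""
    if not sd:
        return []
    fsl_normals = {l.normal for l in fsl}
    # Standard literals whose negation is in the FSL set are true in all instances.
    true_candidates = [l for l in sd if negate(l.normal) in fsl_normals]
    if true_candidates:  # M(k) = T: select on the smallest true lock number
        best = min(l.true_lock for l in true_candidates)
        return [l for l in true_candidates if l.true_lock == best]
    # Otherwise every standard literal is in the FSL set, so M(k) = F:
    # select on the smallest false lock number.
    best = min(l.false_lock for l in sd)
    return [l for l in sd if l.false_lock == best]

# Clause 12 (class T), with S(x) added to the FSL set to give a primitive form.
sd12 = [HLLiteral(4, "@S(x)", 100), HLLiteral(1, "@P(x,I(1),z)", 300),
        HLLiteral(2, "S(z)", 400)]
fsl12 = [HLLiteral(4, "S(x)", 100), HLLiteral(1, "@P(x,I(1),z)", 300),
         HLLiteral(2, "S(z)", 400)]

# Clause 16 (class F): both standard literals appear in the FSL set.
sd16 = [HLLiteral(1, "@P(1,I(1),z)", 300), HLLiteral(2, "S(z)", 400)]
fsl16 = list(sd16)

print(selected_literals(sd12, fsl12))
print(selected_literals(sd16, fsl16))
```

With these inputs the selected literal of clause 12 is its only true literal, @S(x), and the selected literal of clause 16 is the literal with false lock number 300.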

Selected  Factor 

Let  k  be  a  primitive  representation  HL-clause  with  a  selected 
literal L whose second component is L'. If there exists a set of
one or more literals L1, L2, . . . , each distinct from L and
each in SD(k), and with second components L1', L2', . . . ,
respectively, such that OMEGA is the mgu of the set of literals
L', L1', L2', . . . , then the clause

F = <(SD(k) - L1 - L2 . . . ), FSL(k)>(OMEGA)


is  a  selected  factor  of  k  on  selected  literal  L  iff  F  is  feasible. 
OMEGA  is  called  the  factoring  unifier.  A  reduced  selected  factor, 
RF,  of  k  on  selected  literal  L  is  any  selected  factor  of  k  on 
selected  literal  L  such  that  there  is  no  literal  in  SD(RF)  whose 
second  component  is  identical  to  the  second  component  of  L(OMEGA), 
except  L(OMEGA)  itself,  where  OMEGA  is  the  factoring  unifier. 

All  selected  factoring  is  assumed  to  be  reduced,  and  will  be 
referred  to  simply  as  selected  factoring.  Notice  that,  aside  from 
the  literals  factored  out,  the  factoring  unifier  may  also  cause  two 
or  more  standard  literals  to  have  identical  second  components,  but 
with  none  of  them  equal  to  the  second  component  of  the  selected 
literal  on  which  factoring  is  being  performed.  These  other 
literals  are  all  retained  in  the  factored  clause. 

Let  C  be  a  primitive  representation  clause,  and  L  a  selected 
literal  of  C.  If  there  is  no  other  literal  in  SD(C)  with  a  second 
component  identical  to  the  second  component  of  L,  then  C  is  said  to 
be  a  (reduced)  selected  factor  of  itself  on  selected  literal  L. 


Primitive  Representation  Binary  Resolvent 


Let  C  and  D  be  two  primitive  representation  clauses.  Let  LC 
be  a  selected  false  literal  of  C,  and  LD  a  selected  true  literal  of 
D,  such  that  LC  and  @LD  are  unifiable  with  mgu  THETA.  Then  the 
clause R is a primitive representation binary resolvent (on
literals LC and LD, and with respect to model M) of C against D
iff: 

1.  SD(R) = ((SD(C)-LC) U (SD(D)-LD))(THETA)

2.  FSL(R) = (FSL(C) U FSL(D))(THETA)

3.  R  is  feasible  in  M. 

C  is  called  the  false  parent  of  R,  and  D  is  called  the  true  parent 
of  R. 
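
For ground clauses the resolvent definition reduces to set operations plus a feasibility check, which can be sketched as below. The sketch assumes ground normal literals encoded as strings with an "@" prefix for negation, omits lock numbers, represents the model by its set of true atoms, and does not check that LC and LD are the selected literals of their clauses. The example clauses are invented for illustration and the function names are hypothetical.

```python
def negate(literal):
    """Negation of a ground normal literal in the '@' string encoding."""
    return literal[1:] if literal.startswith("@") else "@" + literal

def is_true(literal, true_atoms):
    """Truth of a ground literal in the model given by its true atoms."""
    if literal.startswith("@"):
        return literal[1:] not in true_atoms
    return literal in true_atoms

def ground_resolvent(c, d, lc, ld, true_atoms):
    """Resolve false parent c against true parent d on lc and ld.

    Each clause is a pair (sd, fsl) of frozensets of ground literals.
    Returns the pair (SD(R), FSL(R)), or None if the literals do not
    clash or the result is infeasible in the model.
    """
    sd_c, fsl_c = c
    sd_d, fsl_d = d
    if lc not in sd_c or ld not in sd_d or lc != negate(ld):
        return None
    sd_r = (sd_c - {lc}) | (sd_d - {ld})
    fsl_r = fsl_c | fsl_d
    if any(is_true(l, true_atoms) for l in fsl_r):
        return None  # some FSL literal is true, so R is infeasible
    return sd_r, fsl_r

# Invented example: in the model, Q(A) is true and P(A) is false.
true_atoms = {"Q(A)"}
false_parent = (frozenset({"P(A)"}), frozenset({"P(A)"}))
true_parent = (frozenset({"@P(A)", "Q(A)"}), frozenset({"P(A)", "@Q(A)"}))

print(ground_resolvent(false_parent, true_parent, "P(A)", "@P(A)", true_atoms))
```

If the model is changed so that Q(A) is false, the FSL literal @Q(A) becomes true and the same resolution step is rejected as infeasible.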


Basic  Primitive  Representation  Deduction 


We  define  a  basic  primitive  representation  deduction  (with 
respect  to  M)  of  R  from  S  to  be  a  finite  binary  rooted  tree  with: 

1.  all  of  its  leaves  being  primitive  representation  clauses 
in  S; 

2.  each  internal  node  a  primitive  representation  binary 
resolvent  of  selected  factors  of  the  two  nodes  above  it,  with 
the  condition  that  the  two  literals  resolved  upon  in  the 
resolution  are  the  selected  literals  actually  used  as  the 
literals  on  which  the  selected  factoring  occurred; 

3.  R  at  the  root. 

A  basic  primitive  representation  deduction  from  S  of  an  HL-null 
clause  is  called  a  basic  primitive  representation  refutation  of  S. 




HL-Associated Input Set

Let  S'  be  a  set  of  HL-clauses,  and  S  a  set  of  normal  clauses. 
Then  S'  is  called  the  HL-associated  input  set  (with  lock  numbering 
LN)  for  S  iff: 

1.  LN  is  a  HL-proper  lock  numbering  of  S'; 

2.  there  is  a  1-1  mapping,  C,  from  S  to  S',  and  a  1-1  mapping 
L,  from  normal  literals  of  S  to  literals  of  S'  such  that  for 
all  x  e  S  and  for  all  k  we  have: 

a)  L(k) e SD(C(x)) iff k e x

b)  k e x --> k equals the second component of L(k);

3.  for all u, u e S' --> FSL(u) is empty.

Exact  Primitive  Representation  HL-Associated  Input  Set 

The  exact  primitive  representation  HL-associated  input  set  of 
S  is  a  set,  HLS,  of  exact  primitive  representation  clauses  formed 
by  replacing  each  clause,  C,  in  the  HL-associated  input  set  for  S 
by  the  exact  primitive  representation  set  of  C.  In  HLS  there  will 
in  general  be  many  pairs  of  clauses  that  share  the  same  lock 
numbers.  If  the  literals  in  clauses  in  HLS  have  their  lock  numbers 
re-assigned  so  that  HLS  has  an  unambiguous  HL-proper  lock 
numbering,  then  we  say  that  HLS  has  been  disambiguated. 

This  completes  the  set  of  definitions  needed  for  the  basic  HLR 
strategy  expressed  in  primitive  representation  form.  The  next 
section  states  and  proves  the  soundness  and  completeness  theorem 
for  this  strategy. 




3.5  Soundness  and  Completeness  of  HL-Resolution 


This  section  first  states  a  soundness  and  completeness  theorem 
for  basic  primitive  representation  HL-Resolution,  and  then  gives  a 
proof  of  this  theorem.  Finally  it  is  indicated  how  the  basic 
primitive  representation  results  are  related  to  the  basic 
non-primitive  representation  HL-Resolution.  Refer  to  the  Appendix 
for  an  example  of  some  of  the  notions  used  in  this  section. 




PROOF 

By  normal  resolution  will  be  meant  unrestricted  binary 
resolution  with  implicit  factoring.  Normal  resolution  is  known  to 
be  sound  and  complete. 

soundness 

It  is  to  be  shown  that  if  S  is  satisfiable,  then  there  exists 
no basic primitive representation refutation of HLS. The proof is
by  contradiction.  We  assume  both  the  satisfiability  of  S  and  the 
existence  of  a  basic  primitive  representation  refutation  of  HLS, 
which  we  denote  by  R-HLS.  We  form  the  tree  R'  from  the  tree  R-HLS 
by  replacing  each  clause  in  R-HLS  by  the  set  (actually  list)  of 
second  components  of  its  standard  literals.  By  looking  at  the 
definitions  of  basic  HLR  at  the  primitive  representation  level,  it 
should  be  clear  that  the  new  tree,  R',  is  in  fact  a  normal 
resolution  refutation  of  S.  But  normal  resolution  is  sound,  so 
that  we  have  the  contradiction.  Thus  basic  primitive 
representation  resolution  is  sound. 
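The projection from R-HLS to R' used in the soundness argument can be sketched as follows. This is only an illustration under stated assumptions: a standard literal is modelled as a triple whose second component is the normal literal, a tree node holds a clause and its parent nodes, and the names HLLiteral, Node, NormalNode and project_tree are hypothetical and do not come from this report.

```python
from dataclasses import dataclass
from typing import List, NamedTuple, Tuple


class HLLiteral(NamedTuple):
    first: int    # first component (assumed here to be a lock number)
    normal: str   # second component: the normal literal
    third: int    # third component (assumed here to be a lock number)


@dataclass
class Node:
    standard: List[HLLiteral]          # standard literals of the clause
    parents: Tuple["Node", ...] = ()   # empty for a leaf (input clause)


@dataclass
class NormalNode:
    clause: List[str]                  # a normal clause, kept as a list
    parents: Tuple["NormalNode", ...] = ()


def project_tree(node: Node) -> NormalNode:
    """Replace each clause by the list of second components of its standard literals."""
    return NormalNode(
        clause=[lit.normal for lit in node.standard],
        parents=tuple(project_tree(p) for p in node.parents),
    )
```

The shape of the tree is untouched; only the clause at each node is replaced, which is why a refutation of HLS projects onto a tree of the same shape over S.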

completeness 

For  ease  of  reference  the  characteristics  of  the  clause  sets 
and  other  items  involved  in  the  completeness  proof  are  listed  here. 

S         is a set of normal general level clauses which is
          unsatisfiable.

SG        is a set of normal ground clauses which are ground
          instances of clauses in S, and is minimally
          unsatisfiable.

GR(S,SG)  is a list of sets of substitutions which, when
          applied to S, yields SG.

HLS       is a set of general level exact primitive
          representation clauses, which is the exact primitive
          representation HL-associated input set of S (with
          respect to model M, and with HL-proper lock
          numbering LN). We assume for now that HLS is not
          ambiguously numbered. Later this requirement is
          dropped.

HLSG      is a set of ground exact primitive representation
          clauses formed by applying GR(S,SG) to HLS.

SGL       is a set of ground lock normal clauses identical to
          SG except that lock numbers have been added to the
          literals in SGL in a specific way relative to the
          structure of HLSG.

MLR       is a Modified form of Lock Resolution.

R-SGL     is a refutation of SGL according to MLR.

R-HLSG    is a ground basic primitive representation
          refutation of HLSG.

R-HLS     is a general level basic primitive representation
          refutation of HLS.


The  completeness  proof  requires  that  we  establish  the 
existence  of  a  basic  primitive  representation  refutation  of  HLS 
based  upon  the  unsatisfiability  of  S,  and  that  this  be  done  for  an 
arbitrary  Herbrand  interpretation,  M,  as  the  model,  and  for  an 
arbitrary HL-proper lock numbering, LN, of HLS. The notion of
satisfiability  of  HLS  is  neither  defined  nor  used  in  this  proof. 

Let  S,  HLS,  LN  and  M  be  as  stated  in  the  hypothesis  of  the 
theorem.  For  the  completeness  proof  we  further  assume  that  S  is 
unsatisfiable. We will consider S to be a non-ground set of
clauses  and  state  the  proof  for  that  situation.  If  S  is  a  ground 
set  of  clauses  then  a  large  part  of  the  proof  given  here  is 
superfluous,  and  the  reader  should  be  able  to  identify  which  parts 
are  relevant  to  treating  the  ground  case.  The  following  is  an 
outline  of  the  proof  steps. 

1.  Preliminary  definitions  and  lemma  statements. 

2. Herbrand's theorem applied to S yields a ground set SG,
which is minimally unsatisfiable.

3. A set of ground clauses, HLSG, each of which is a
strict grounding of a clause in HLS, is defined and related to
clauses in SG. A lock numbered set of clauses, SGL, is formed
from SG.

4.  A  Modified  form  of  Lock  Resolution  (MLR)  is  defined  and 
applied  to  SGL  to  yield  a  refutation  of  SGL,  called  R-SGL. 
R-SGL  is  then  transformed  into  a  basic  primitive 
representation  refutation  of  HLSG,  called  R-HLSG. 




5.  The  Lifting  Lemma  is  used  to  transform  R-HLSG  into  a 
general  level  refutation,  R-HLS,  of  HLS  which  is  a  basic 
primitive  representation  refutation  of  HLS. 

6.  The  Lifting  Lemma  is  proved. 

In  the  completeness  proof  it  is  best  to  think  of  clauses  as 
"bags"  or  "heaps"  of  literals,  as  opposed  to  ordinary  sets  of 
literals. 

Step 1.

The  following  lemma  will  be  used  in  step  6  in  the  proof  of 
Lemma  IV. 

LEMMA  I 

Let L be a set of literals

A1, A2, ..., Am, B1, B2, ..., Bn

and SIGMA a grounding substitution for L such that

A1(SIGMA) = A2(SIGMA) = ... = Am(SIGMA).

If LAMBDA is the mgu of the set of literals A1, A2, ..., Am,
then there exists a substitution, RHO, such that SIGMA =
(LAMBDA)(RHO).


We  do  not  prove  this  lemma. 
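Although the lemma is not proved here, its content can be checked on a small instance. The sketch below is an assumption-laden illustration, not part of the report: variables are Python strings, constants and compound terms are tuples headed by their symbol, and the helper names walk, apply, occurs, unify and mgu are hypothetical. It uses the standard fact that an idempotent mgu LAMBDA satisfies SIGMA = (LAMBDA)(SIGMA) for every unifier SIGMA, so RHO may be taken to be SIGMA itself; the lemma only asserts that some RHO exists.

```python
def walk(t, s):
    """Follow variable bindings in s until an unbound variable or a non-variable term."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def apply(t, s):
    """Apply substitution s to term t, resolving chains of bindings."""
    t = walk(t, s)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(a, s) for a in t[1:])


def occurs(v, t, s):
    """Occurs check: does variable v appear in t under the bindings of s?"""
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s):
    """Extend s to a unifier of a and b, or return None if none exists."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


def mgu(literals):
    """Most general unifier of a list of literals (None if they do not unify)."""
    s = {}
    for lit in literals[1:]:
        s = unify(literals[0], lit, s)
        if s is None:
            return None
    return s


# A1 and A2 play the role of the A's of the lemma, B1 the role of a B.
A1 = ("p", "X", ("f", "Y"))
A2 = ("p", "Z", ("f", "W"))
B1 = ("q", "X", "Y")
SIGMA = {"X": ("a",), "Z": ("a",), "Y": ("b",), "W": ("b",)}

LAMBDA = mgu([A1, A2])
RHO = SIGMA  # one admissible choice of RHO
factors = all(apply(apply(t, LAMBDA), RHO) == apply(t, SIGMA) for t in (A1, A2, B1))
```

Note that the factorization holds on B1 as well as on the unified literals, which is what the Lifting Lemma later relies on.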



Strict  Grounding 

Let  U  be  a  primitive  representation  clause  and  V  a  ground 
primitive  representation  clause.  Then  V  is  a  strict  grounding  of  U 
under  substitution  RHO,  relative  to  the  Herbrand  interpretation  M, 
iff  all  of  the  following  conditions  are  met: 

1.  V  is  feasible  with  respect  to  M. 

2. There exists a total 1-1 mapping, g, from SD(U) onto
SD(V) such that for all k in SD(U) it is the case that g(k) is
identical to k(RHO) in all three components.

3. There exists a total 1-1 mapping, h, from FSL(U) onto
FSL(V) such that for all k in FSL(U) it is the case that h(k)
is identical to k(RHO) in all three components.

If a literal of U and a literal of V map to each other under g
or h, then those two literals are said to be naturally associated
(by g or by h).
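Conditions 2 and 3 amount to saying that, as bags, SD(U) and FSL(U) with RHO applied coincide with SD(V) and FSL(V). A minimal sketch of that check follows, under several assumptions that are not taken from the report: a clause is a pair (SD, FSL) of lists of triples, RHO acts only on the second component of a triple, terms are encoded with strings for variables and tuples otherwise, and feasibility is supplied by the caller as a predicate because its definition lies outside this section. All function names are hypothetical.

```python
from collections import Counter


def substitute(term, rho):
    """Apply a ground substitution rho to a term (variables are str, other terms tuples)."""
    if isinstance(term, str):
        return rho.get(term, term)
    return (term[0],) + tuple(substitute(a, rho) for a in term[1:])


def ground_literal(lit, rho):
    """Apply rho to the second component only; the other two components are unchanged."""
    first, normal, third = lit
    return (first, substitute(normal, rho), third)


def is_strict_grounding(U, V, rho, feasible):
    """Check conditions 1-3, using bag equality in place of the 1-1 onto mappings g and h."""
    sd_u, fsl_u = U
    sd_v, fsl_v = V
    return (
        feasible(V)
        and Counter(ground_literal(k, rho) for k in sd_u) == Counter(sd_v)
        and Counter(ground_literal(k, rho) for k in fsl_u) == Counter(fsl_v)
    )
```

Because bags rather than sets are compared, two literals of U that become identical under RHO must both appear in V, so no merging of identical literals is hidden in a strict grounding.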

LEMMA  II 

Let x be a strict grounding of the primitive representation
clause X under substitution TAU. If x' is a selected true (false)
literal of x, then for every literal X' in X such that X'(TAU) = x'
(i.e. equality for all three components), it is the case that X'
is a selected true (false) literal of X. Furthermore there will
exist at least one X' in X such that X'(TAU) = x'.




PROOF OF LEMMA II

Case  1:  x'  is  a  selected  true  literal  in  x. 

The proof is by contradiction, where we assume x' is a
selected true literal of x and that there exists some X' in X such
that X'(TAU) = x', and X' is not a true selected literal of X. If
x' is a selected true literal in x, then @x' is an element of the
FSL of x, and by the definition of a strict grounding there exists
at least one literal @X'' in the FSL of X such that @X''(TAU) =
@x', and at least one standard literal X'' such that X''(TAU) = x',
where equality between literals here means that they are identical
in all three components. Let X' be any specific literal in X such
that X'(TAU) = x'. By the definition of a strict grounding, x must
be feasible. This means that @X' is in FSL(X). Assume now that X' is
not a selected true literal of X. By the definition of a selected
true literal the only way that X' can fail to be a selected true
literal of X is if there exists a standard literal L in X such that
@L is in the FSL of X, and the true lock number of L is less than
the true lock number of X'. But if this were the case, then there
would exist a standard literal, L(TAU), and an FSL literal,
@L(TAU), both in x, and the true lock number of L(TAU) would be
less than the true lock number of x'. But then x' could not be a
selected literal of x. Therefore we have the contradiction, and X'
must be a selected literal of X.

Case  2:  x'  is  a  selected  false  literal  in  x. 


Similar to case 1.




Immediate  Primitive  Representation  Deduction 

Let w, u and v be primitive representation clauses. An
immediate  primitive  representation  deduction  of  w  from  u  and  v  is  a 
basic  primitive  representation  deduction  tree  with  u  and  v  as 
leaves,  and  w  at  the  root,  and  containing  no  other  clauses.  If 
there  is  an  immediate  primitive  representation  deduction  of  w  from 
u  and  v,  then  w  is  said  to  be  immediately  deducible  from  u  and  v. 


Ground  Normal  Lock  Image 

Let  x  be  a  ground  clause  that  consists  of  a  set  of  ground 
normal  literals  with  lock  numbers,  i.e.  each  literal  is  an  ordered 
pair  in  which  the  first  component  is  a  normal  ground  literal,  and 
the  second  component  is  a  number.  Let  X  be  a  ground  primitive 
representation  clause.  Then  x  is  a  ground  normal  lock  image  of  X 
iff  there  exists  a  function,  g  (not  necessarily  unique),  which  is  a 
1-1  total  mapping  from  standard  literals  of  X  onto  literals  of  x 
such  that  the  following  holds: 

1. L in SD(X) and L in FSL(X)

   --> g(L) = <second component of L, f(L)>

2. L in SD(X) and @L in FSL(X)

   --> g(L) = <second component of L, t(L)>

where t and f are the selector functions for the true and false
lock numbers of HL-literals. The function g is called the ground
normal lock function.




The next lemma uses the MLR, which is defined in step 4 of the
completeness proof.

LEMMA  III 

Let X and Y be ground primitive representation clauses, and x,
y be ground normal lock images of X and Y respectively. If there
exists a clause z, which is a resolvent of x and y according to
MLR, then there exists an immediate primitive representation
deduction of a clause, Z, from X and Y, such that z is a ground
normal lock image of Z.

PROOF  OF  LEMMA  III 

We do not give a detailed proof here, but just indicate how to
proceed.

First we realize that some specific model, M, must exist, in
order for X and Y to be well defined. This is because a primitive
representation clause is by definition a feasible clause, and so is
defined relative to some model. Also we can assume a ground normal
lock function, G1, exists (not necessarily unique) which associates
literals from X to x, according to the definition of a ground
normal lock image. Similarly for a G2 associating literals from Y
to y. Then, for the z which exists by the lemma hypothesis, we can
construct a set of HL-literals (using G1 and G2) to represent the
standard literals of clause Z. An FSL set for the clause Z is
simply the union of the FSL sets of X and Y. Now it is necessary
to show that this constructed Z actually can be at the root of an
immediate primitive representation deduction with X and Y as
leaves. This requires checking the definitions of MLR and HLR and
seeing that everything done to construct z from ground normal
locked literals in x and y according to MLR, is also allowed in HLR
to produce Z from the corresponding ground HL-literals in X and Y.
Specifically the factoring of MLR corresponds to the reduced
selected factoring of HLR on ground clauses. Also, every selected
literal in x under MLR is mapped to a selected literal of X in HLR
by the function G1. Similarly for G2 (if we were doing this in
detail this would require a lemma statement similar to Lemma II).
Finally the constructed Z must be feasible since it has an FSL
which is just the union of FSL sets which consist of only ground
literals, and which came from clauses that were feasible.

LEMMA  IV 

(ground to general level lifting lemma)

Let  X,Y,x,y,z  be  primitive  representation  clauses,  in  which  X 
and  Y  have  their  variables  standardized  apart,  and  with  x  a  strict 
grounding  of  X  under  substitution  TAU,  and  y  a  strict  grounding  of 
Y  under  substitution  NU.  If  it  is  the  case  that  there  exists  an 
immediate  primitive  representation  deduction  of  z  from  x  and  y, 
then  there  exists  an  immediate  primitive  representation  deduction 
of  a  clause  Z,  from  X  and  Y,  such  that  z  is  a  strict  grounding  of 
Z. 

PROOF  OF  LEMMA  IV  (see  step  6) 




Step 2.

S  is  an  unsatisfiable  set  of  normal  clauses,  and  we  assume 
that  the  clauses  in  S  have  their  variables  standardized  apart. 
Consider  S  as  ordered  so  that  we  may  speak  of  the  k-th  clause  in  S. 

By  Herbrand's  theorem  (Chang  and  Lee,  1973)  there  exists  a  set 
of ground clauses, SG, each of which is an instance of a clause in
S,  and  SG  is  minimally  unsatisfiable  (in  particular  SG  does  not 
contain  two  clauses  which  are  identical).  The  substitutions  which 
convert  S  to  SG  are  denoted  by  GR(S,SG).  GR(S,SG)  is  a  list  of 
sets  of  substitutions,  such  that  the  k-th  element  of  GR(S,SG)  is  a 
set  of  j(k)  distinct  substitutions  to  be  applied  to  the  k-th  clause 
in  S,  generating  j(k)  distinct  ground  instances  of  that  clause  in 
SG.  We  do  not  delete  duplicate  literals  in  any  clause  in  SG.  A 
clause  x  in  S  is  said  to  be  naturally  associated  with  a  clause  y  in 
SG  iff  x  is  the  k-th  clause  in  S,  for  some  k,  and  x(TAU)  =  y  for 
some  TAU  in  the  k-th  element  of  GR(S,SG).  The  symmetric  relation 
"naturally  associated"  defines  a  function  which  is  many-one  from  SG 
into  S,  and  is  total  on  SG.  If  x  in  S  and  x'  in  SG  are  naturally 
associated,  we  define  an  extension  of  the  naturally  associated 
relation  to  include  a  natural  association  of  literals  L  in  x  and  L' 
in  x'  in  the  obvious  way.  The  natural  association  of  literals  in  x 
and  x'  defines  a  total  1-1  mapping  from  x  onto  x'. 

Notice that Herbrand's theorem does not imply a unique SG. We
choose any minimally unsatisfiable one for the completeness proof.
Having S and a specific SG still does not necessarily give a unique
GR(S,SG). We are free to choose any specific GR(S,SG) for the
completeness proof. Having SG and GR(S,SG) specific allows a
unique relation of naturally associated to be defined on pairs of
clauses, one in S and one in SG. However the naturally associated
relation on pairs of literals is still not unique, and we are free
to choose any specific one.
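The bookkeeping of Step 2 can be pictured with a short sketch. It assumes, for illustration only, that S is a list of clauses, that each clause is a list of literals encoded as tuples with string variables, and that GR(S,SG) is a list whose k-th entry is a list of substitutions (dictionaries) for the k-th clause; indices are 0-based in the code, and the names substitute and ground_instances are hypothetical.

```python
def substitute(term, rho):
    """Apply a ground substitution to a term (variables are str, other terms are tuples)."""
    if isinstance(term, str):
        return rho.get(term, term)
    return (term[0],) + tuple(substitute(arg, rho) for arg in term[1:])


def ground_instances(S, GR):
    """Return SG as a list of (k, ground clause) pairs; k records the natural association."""
    SG = []
    for k, (clause, substitutions) in enumerate(zip(S, GR)):
        for rho in substitutions:
            # Clauses are lists, so duplicate literals are deliberately kept.
            SG.append((k, [substitute(lit, rho) for lit in clause]))
    return SG


S = [[("p", "X"), ("p", "Y")], [("-p", "Z")]]
GR = [[{"X": ("a",), "Y": ("a",)}], [{"Z": ("a",)}]]
SG = ground_instances(S, GR)
```

Keeping the index k alongside each ground clause makes the clause level association unique once GR(S,SG) is fixed, while the two identical literals in the first ground clause show why the literal level association is not.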


Step  3. 


We  are  now  going  to  define  a  set,  HLSG,  which  is  a  set  of 
ground  exact  primitive  representation  clauses.  Each  clause  in  HLSG 
will  be  a  strict  grounding  of  some  clause  in  HLS. 

Assume  that  the  k-th  element  of  GR(S,SG)  is  non-empty.  Let  c 
be  the  k-th  clause  in  S,  and  EPRS(c)  the  set  of  exact  primitive 
representations  in  HLS  for  the  normal  clause  c  in  S.  Notice  that 
this  is  actually  a  two  stage  connection.  First  there  is  a  unique 
clause, c', in the HL-associated input set of S, which is the
non-primitive  representation  HL-clause  corresponding  to  c.  Then 
EPRS(c)  is  the  exact  primitive  representation  set  of  c'.  Any 
element  of  EPRS(c)  is  said  to  correspond  to  c.  Assume  each  exact 
primitive  representation  clause  in  EPRS(c)  uses  the  same  variable 
names  as  are  used  in  c.  Let  RHO  be  an  element  of  the  k-th  list 
element  of  GR(S,SG).  Then  applying  RHO  to  each  member  of  EPRS(c) 
will  result  in  precisely  one  feasible  ground  clause,  which  we 
denote  by  apply(RHO,  EPRS(c)).  The  clause  apply(RHO,  EPRS(c))  will 
be  considered  as  a  strict  grounding,  i.e.  no  deletion  of  identical 
literals  will  be  done. 




The set HLSG is the set of ground exact primitive representation
clauses which are generated by applying all substitutions contained
in all of the elements of GR(S,SG) to the appropriate clauses in
HLS. Each clause in HLSG is ground, is an exact primitive
representation clause, and is a strict grounding of a clause in
HLS.

Again  we  extend  the  naturally  associated  relation,  this  time 
to  relate  normal  clauses  (and  normal  literals)  in  SG  to  clauses 
(and  standard  literals)  in  HLSG.  We  relate  the  clause  x  in  SG  to 
the clause y in HLSG iff there exist clauses u, v and substitution
OMEGA  with  the  following  properties: 

1.  there  exists  a  k  such  that  u  is  the  k-th  clause  of  S,  OMEGA 
is  an  element  of  the  k-th  element  of  GR(S,SG),  and  u(OMEGA)  =  x. 

2.  v  is  an  exact  primitive  representation  clause  in  HLS  which 
corresponds  to  u,  and  v(OMEGA)  =  y. 


The  pairing  of  clauses  between  SG  and  HLSG  is  unique  (if  a  specific 
choice  has  been  made  for  GR(S,SG)).  This  pairing  of  clauses  by  the 
naturally  associated  relation  constitutes  a  1-1  total  function 
between  SG  and  HLSG. 


The naturally associated relation between literals in clauses
which are naturally associated is defined in the obvious way by
noting that a clause in SG is identical to the set (actually "bag")
of second components of the standard literals of the clause with
which it is naturally associated in HLSG. Thus for x in SG, and y
in HLSG, if x and y are naturally associated then there will exist
a total 1-1 mapping of literals from x to literals of y. The
pairing of literals between SG and HLSG is not unique. We
arbitrarily choose any one of them for the completeness proof. We
note here that this lack of a unique extension of the naturally
associated relation on literals is not essential, and that by
keeping careful track of the identity (or source) of literals in S,
SG, HLS, and HLSG, we could have arrived at this point in the proof
with a unique pairing of literals. This was deemed an unnecessary
complication since the completeness proof does not require it.

Now  we  define  SGL  to  be  a  copy  of  SG  in  which  each  literal,  L, 
of  SGL,  is  assigned  a  lock  number  in  the  following  way. 

If  L'  is  the  literal  in  HLSG  which  is  naturally  associated 
with  L,  we  assign  to  L  the  true  lock  number  of  L'  if  L'  is  true 
in  M,  else  we  assign  the  false  lock  number  of  L'  to  L. 

In  the  above  the  relation  naturally  associated  on  clauses  and 
literals  of  SGL  to  clauses  and  literals  of  HLSG  has  been  assumed  to 
be  defined  in  the  obvious  way. 

Each  clause  in  SGL  is  a  ground  normal  lock  image  of  the  clause 
in  HLSG  with  which  it  is  naturally  associated. 
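The lock assignment that turns SG into SGL can be sketched as below. The encoding is an assumption made for illustration: a ground literal is a pair (sign, atom), the Herbrand interpretation M is given as the set of atoms it makes true, and an HL-literal is written as a triple (false lock number, literal, true lock number); the order of the components within the triple and the names is_true_in and lock_image are hypothetical.

```python
def is_true_in(model, literal):
    """A ground literal (sign, atom) is true in M iff its sign agrees with membership in M."""
    sign, atom = literal
    return (atom in model) if sign else (atom not in model)


def lock_image(hl_clause, model):
    """Build the lock numbered normal clause for one clause of HLSG."""
    image = []
    for false_lock, literal, true_lock in hl_clause:
        # True literals keep their true lock number, false literals their false one.
        lock = true_lock if is_true_in(model, literal) else false_lock
        image.append((literal, lock))
    return image


M = {"p(a)"}
hl_clause = [
    (7, (True, "p(a)"), 2),
    (4, (True, "q(a)"), 9),
    (5, (False, "q(a)"), 1),
]
sgl_clause = lock_image(hl_clause, M)
```

Each literal of the result carries exactly one lock number, which is the form ordinary lock resolution expects.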

Step 4.

We define MLR to be a Modified Lock Resolution refinement
strategy for ground sets of arbitrarily lock numbered normal
clauses. MLR is similar to ordinary Lock Resolution except with
respect to factoring. No explicit factoring is allowed, which
means at the ground level that the input set of clauses does not
have duplicate literals removed, and when resolvents are formed
there will be no merging of duplicate literals in the resolvent.
There is, however, mandatory implicit factoring of the following
form. When resolving clause C1 on literal L1 with clause C2 on
literal L2, all literals, L1*, in C1 which are identical (except
possibly for lock number) to L1 are implicitly factored away.
Similarly for all literals L2* in C2 identical to L2. MLR is a
complete ground refinement strategy. This can be proved easily in
several ways, one of which is the same as is used to prove LR
complete (Chang and Lee, 1973). Thus there will exist a MLR
refutation, R-SGL, of SGL.
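A single MLR step on ground clauses might look like the following sketch. It assumes the usual lock resolution restriction that only a lowest numbered literal of each clause may be resolved upon, encodes a locked literal as ((sign, atom), lock number), and keeps clauses as lists so that duplicates are never merged; the function names are hypothetical and the code is not taken from any implementation of HLR.

```python
def complement(literal):
    """Flip the sign of a ground literal (sign, atom)."""
    sign, atom = literal
    return (not sign, atom)


def lowest_locked(clause):
    """Literals carrying the smallest lock number in the clause."""
    low = min(lock for _, lock in clause)
    return [lit for lit, lock in clause if lock == low]


def mlr_resolvents(c1, c2):
    """All MLR resolvents of two ground lock numbered clauses."""
    if not c1 or not c2:
        return []
    out = []
    for l1 in set(lowest_locked(c1)):
        l2 = complement(l1)
        if l2 in lowest_locked(c2):
            # Mandatory implicit factoring: every copy of L1 in C1 and of L2 in C2
            # is dropped, whatever its lock number.
            rest1 = [(lit, k) for lit, k in c1 if lit != l1]
            rest2 = [(lit, k) for lit, k in c2 if lit != l2]
            # No explicit factoring: the two remainders are concatenated unmerged.
            out.append(rest1 + rest2)
    return out


c1 = [((True, "p"), 1), ((True, "p"), 5), ((True, "q"), 3)]
c2 = [((False, "p"), 2), ((True, "q"), 4)]
resolvents = mlr_resolvents(c1, c2)
```

In the example both copies of p disappear from the first clause, while the two occurrences of q survive side by side in the resolvent with their own lock numbers.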

Now we transform the refutation tree R-SGL by replacing the
ground normal lock clause at each node by a ground primitive
representation clause in the following way. If the node is a leaf,
and x in SGL is the clause there, then replace it by the naturally
associated exact primitive representation clause of x which is in
HLSG. For each internal node, k, of R-SGL which has not had its
clause replaced, but whose parents have had their clauses replaced,
we apply Lemma III, which asserts the existence of a primitive
representation clause which is immediately deducible from the new
clauses at the parents of k. Furthermore this deduced clause has
the original clause at node k as a ground normal lock image, and
thus it becomes the new clause at node k. We continue this process
of clause replacement, by virtue of Lemma III, in a breadth first
manner, i.e. first all level 1 nodes are replaced, then all level
2 nodes, etc., until the root is finally replaced. At this point
we recognize that at the root of the transformed tree must be a
ground HL-null clause, since the new clause at the root has a
ground normal lock image which has no literals.

Thus  we  have  shown  the  existence  of  a  basic  primitive 
representation  refutation  of  the  ground  set  HLSG,  and  we  call  this 
refutation  R-HLSG. 
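The node by node replacement just described follows a simple pattern, sketched here under assumptions: a tree node holds a clause and its parent nodes, replace_leaf stands for the natural association on input clauses, and deduce stands for whatever clause the relevant lemma guarantees. Both callbacks and the names Node and transform are hypothetical. The sketch recurses from the root rather than sweeping level by level; the result is the same, because the new clause at a node depends only on the new clauses at its parents.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    clause: object
    parents: Tuple["Node", ...] = ()   # empty for a leaf


def transform(node, replace_leaf, deduce):
    """Rebuild the tree: leaves via replace_leaf, internal nodes via deduce."""
    if not node.parents:
        return Node(replace_leaf(node.clause))
    new_parents = tuple(transform(p, replace_leaf, deduce) for p in node.parents)
    # deduce receives the old clause at this node and the new parent clauses.
    new_clause = deduce(node.clause, *(p.clause for p in new_parents))
    return Node(new_clause, new_parents)
```

The same skeleton serves for the later passage from a ground refutation to a general level one, with a different pair of callbacks.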

Step 5.

Now  we  transform  R-HLSG  into  a  general  level  refutation  of 
HLS,  called  R-HLS.  This  is  done  one  level  at  a  time  by  first 
replacing  each  leaf  clause,  x,  of  R-HLSG  by  the  clause  y  in  HLS 
such  that  x  is  a  strict  grounding  of  y.  We  remember  that  HLSG  was 
constructed as a set of strict groundings of HLS, so such a y will
exist,  and  will  in  fact  be  unique.  Now,  by  a  process  completely 
analogous  to  the  transformation,  in  the  previous  step,  of  R-SGL 
into  R-HLSG,  we  apply  Lemma  IV  repeatedly  to  transform  R-HLSG  into 
R-HLS.  The  new  clause,  x,  that  replaces  the  old  root  clause,  y, 
must  be  an  HL-null  clause,  since  y  had  no  standard  literals  and  y 
is  a  strict  grounding  of  x.  The  resulting  new  tree  is  a  general 
level  basic  primitive  representation  refutation  of  HLS. 

Thus  we  have  shown  the  existence  of  a  basic  primitive 
representation  refutation  of  HLS  based  on  the  unsatisfiability  of 
the  set  S.  The  proof  of  completeness  now  just  requires  the  proof 
of  Lemma  IV,  the  Lifting  Lemma. 




Step 6.

We  repeat  the  statement  of  Lemma  IV  here. 

LEMMA  IV:  Let  X,Y,x,y,z  be  primitive  representation  clauses, 

in  which  X  and  Y  have  their  variables  standardized  apart,  and  with 
x  a  strict  grounding  of  X  under  substitution  TAU,  and  y  a  strict 
grounding  of  Y  under  substitution  NU.  If  it  is  the  case  that  there 
exists  an  immediate  primitive  representation  deduction  of  z  from  x 
and  y,  then  there  exists  an  immediate  primitive  representation 
deduction  of  a  clause  Z,  from  X  and  Y,  such  that  z  is  a  strict 
grounding  of  Z. 

PROOF  OF  LEMMA  IV 

We will need mainly to concentrate on the standard literals of
clauses.

Let

x = x1, x2, ... xn; FSL

y = y1, y2, ... ym; FSL

z = z1, z2, ... zk; FSL

By the hypotheses of the lemma, x and y resolve to produce z.
Without loss of generality we assume that x is the false clause and
y the true clause, and that x1 and y1 are the selected literals
used in the resolution of x against y to yield z. Thus x1 = @y1,
where the negation sign is taken to be an operator that
syntactically "inverts" the negation status of a literal. Let Fx
be the set of literals in x that have the same second components as
x1, including x1 itself. Let Fy be the similar set relative to
literal y1 for clause y. Then the immediate primitive
representation deduction of z from x and y must produce a z with
the following structure:

SD(z) = (SD(x) - Fx) ∪ (SD(y) - Fy)

FSL(z) = FSL(x) ∪ FSL(y)

The clause z is of necessity feasible since x and y were feasible
ground clauses. By the lemma hypothesis x is a strict grounding of
X (under TAU) and y is a strict grounding of Y (under NU). This
allows us to assume the existence of a (not necessarily unique)
naturally associated relation between literals of X and x, and also
between literals of Y and y. This relation is defined by the
functions g and h which must exist for the definition of a strict
grounding to apply. The functions g and h, and therefore this
definition of the naturally associated relation are not necessarily
unique. Let X1 be the literal in X naturally associated with x1 in
x, and FX the set of literals of X whose naturally associated
literals in x are in Fx. Similarly for Y1 and FY. Let


X = X1, X2, ... Xn; FSL

Y = Y1, Y2, ... Ym; FSL

have their literals written in the same order as the naturally
associated literals in their strict groundings, X(TAU) and Y(NU),
respectively. Now, knowing that x1 = @y1 we can assert that there
exists a unifier of the set FX ∪ FY (ignoring negation signs). By
the Lemma hypothesis X and Y have their variables standardized
apart, so TAU and NU act on disjoint sets of variables and can be
combined into a single grounding substitution. Thus we can assert
by Lemma I that for any mgu, OMEGA, of the set FX ∪ FY (ignoring
negation signs), there exists a substitution SIGMA such that for
1 <= i <= n and 1 <= j <= m

Xi(OMEGA)(SIGMA) = xi
Yj(OMEGA)(SIGMA) = yj

By Lemma II we can assert that X1 is a selected false literal
of X, and Y1 is a selected true literal of Y. This means that the
clause

X' = ((SD(X) - FX) ∪ {X1})(OMEGA); FSL = FSL(X)(OMEGA)

is a reduced selected factor of X, and is feasible. Similarly for
a Y'. Therefore there will exist a primitive representation binary
resolvent, Z, of X' and Y' on selected literals X1(OMEGA) and
Y1(OMEGA), in X' and Y' respectively. This constitutes an
immediate primitive representation deduction of Z, from X and Y, on
selected literals X1 and Y1, respectively. Specifically Z is
defined by

SD(Z) = (SD(X) - FX)(OMEGA) ∪ (SD(Y) - FY)(OMEGA)

FSL(Z) = FSL(X)(OMEGA) ∪ FSL(Y)(OMEGA)




From  the  construction  of  Z  it  should  be  realized  that 

SD(Z)(SIGMA) = SD(z)

and

FSL(Z)(SIGMA) = FSL(z)

Furthermore  the  clause  Z  is  feasible  since  z  is  feasible.  Thus 
there  is  an  immediate  primitive  representation  deduction  of  Z  from 
X  and  Y,  such  that  z  is  a  strict  grounding  of  Z.  This  proves  the 
Lifting  Lemma. 


The  above  completeness  proof  shows  the  existence  of  a  basic 
primitive  representation  refutation  of  HLS  based  on  the 
unsatisfiability of S. It was assumed in the statement of the
soundness  and  completeness  theorem  that  HLS  had  been  disambiguated. 
The  proof  of  completeness  itself  is  a  valid  argument  independently 
of  whether  HLS  is  disambiguated  or  not.  A  stronger  refinement 
results  if  HLS  is  disambiguated.  However,  in  order  to  relate  the 
basic  primitive  representation  HLR  to  the  basic  non-primitive 
representation  HLR,  HLS  should  be  left  ambiguous.  This 
relationship  involves  the  notion  of  a  non-primitive  representation 
clause  "standing  for",  or  containing  within  itself  several  distinct 
primitive  representation  clauses.  Such  a  relationship  is  analogous 
to  the  relationship  of  a  general  level  clause  to  the  ground 
instances  it  represents  or  contains.  In  order  to  support  the 
completeness  of  HLR  at  the  non-primitive  representation  level,  we 
would  need  to  use  something  of  the  form: 




LEMMA  V:  (primitive  representation  to  non-primitive 
representation  "lifting"  lemma) 

Let  u,  v  be  primitive  representations,  respectively,  of 
HL-clauses  U  and  V.  If  there  is  an  immediate  primitive 
representation  deduction  of  w  from  u  and  v,  then  there  is  an 
immediate  HL-deduction  of  a  clause  W  from  U  and  V  such  that  w  is  a 
primitive  representation  of  W. 

Such  a  lemma  statement  requires  that  "immediate  HL-deduction" 
be  defined.  This  requires  producing,  at  the  non-primitive 
representation  level,  a  complete  set  of  definitions  of 
HL-Resolution, much as has been done in section 3.4 for primitive
representations.

We  will  not  actually  carry  out  this  program  of  development 
here  for  the  following  reasons: 

1. We are not dwelling on implementation issues in this
report, and even if we were it is not clear that the
prescriptions in sections 3.1 and 3.2 are the best "lifted"
versions of the basic primitive representation strategy stated
here.

2.  HL-Resolution  is  being  stressed  as  a  theoretical 
refinement  strategy  in  this  report,  and  in  this  vein  the 
primitive  representation  viewpoint  lays  bare  more  of  the 
structure  of  the  strategy,  and  is  thus  the  level  at  which  the 
strategy  should  be  studied. 




3. Further refinements and extensions of HLR will undoubtedly
best  be  initially  phrased  at  the  primitive  representation 
level,  and  only  later,  if  at  all,  "lifted"  above  the  primitive 
representation  level.  There  is  also  a  strong  possibility  that 
future  extensions  of  HLR  might  be  best  expressed  in  primitive 
representations  even  in  an  implementation. 

The  basic  issue  with  respect  to  implementation  of  HLR  is  that, 
on  a  per  clause  basis,  primitive  representations  require  more 
storage.  In  addition,  in  the  early  part  of  the  search,  there  will 
be  many  more  clauses  than  if  a  non-primitive  representation  form  is 
used.  On  the  other  hand,  because  of  the  ability  to  disambiguate 
the  lock  numbering  at  the  primitive  representation  level,  we  see 
that,  as  formulated  in  this  report,  the  primitive  representation 
level  is  essentially  a  stronger  statement  of  the  basic  HLR 
strategy. What one would wish to have is some form of dynamic
assignment of lock numbers at the non-primitive representation
level so as to achieve the same (or perhaps even stronger) degree
of singly connectedness as is available at the primitive
representation level.




3.6  Evaluation  of  the  HL-Resolution  Strategy 


The  HLR  strategy  as  stated  in  this  report  is  a  semantically 
oriented  resolution  strategy  which  attempts  to  achieve  search 
efficiency  by  extending  the  notion  of  a  clause  to  include 
information  about  the  derivation  history  of  the  clause.  This 
information  then  restricts  the  way  in  which  the  clause  may  further 
be  used,  so  as  to  constrain  the  search  at  the  general  level  to 
correspond  only  to  ground  refutations  of  a  very  restricted  form. 
This  ground  refutation  is  exactly  a  lock  refutation  on  the  ground 
set,  with  a  lock  numbering  meeting  some  additional  conditions 
relative  to  a  chosen  Herbrand  interpretation.  HLR  is  more  faithful 
to  its  ground  form  in  the  sense  that  an  HL-refutation  found  at  the 
general  level  will  always  correspond  (in  a  strong  sense)  to  some 
HL-refutation of some unsatisfiable ground set. This is not the
case in SR (or in Ll-clash resolution (Slagle, 1972)), nor is it
the  case  in  TMS. 

HL-Resolution  was  seen  to  help  alleviate  both  the  term 
substitution  problem  and  the  local  interaction  problem,  but  still 
was  not  an  adequate  treatment  of  these  problems.  There  are 
indications  that  much  more  can  be  done  in  HLR  to  neutralize  these 
two sources of search combinatorics.


Judging  the  relative  merits  of  various  resolution  strategies 
is  often  very  difficult  to  do  with  any  certainty.  This  is 
particularly  the  case  when  the  basic  HL-Resolution  strategy  is 
involved  because  of  two  factors: 




Page  86 


1. There is no full implementation of the basic HL-refinement
yet  available. 

2.  There  are  two  additional  degrees  of  freedom  in  an 
HL-search,  namely  the  choice  of  model,  and  the  particular 
assignment  of  lock  numbers. 

Within  the  limitations  imposed  by  these  two  factors  the 
following  items,  at  least,  can  be  safely  stated. 

1.  For  a  sentential  clause  set  HLR  automatically  becomes 
normal  Lock  Resolution,  which  is  among  the  more  efficient  of 
the  known  resolution  refinement  strategies,  particularly  for 
(near  or  exactly)  minimally  unsatisfiable  sets. 

2.  When  the  model  is  chosen  to  be  all  negative  literals,  HLR 
specializes  to  a  refinement  of  PI  deduction.  Thus,  in  the 
worst  case  of  a  completely  trivial  model,  HLR  can  be  assumed  at 
least  superior  to  PI  deduction  (or  to  PI  deduction  under  some 
renaming).

3.  On  simple  problems  the  search  seems  to  grow  at  about  the 
same  rate  as  for  SL-resolution  (Kowalski  and  Kuehner,  1971). 
There  are  reasons  to  believe  that  HLR  will  gain  a  relative 
advantage  over  SL-resolution  both  as  the  clause  sets  become 
more  complex  (but  near  minimally  unsatisfiable),  and  as  clause 
sets  become  cluttered  with  irrelevant  clauses.  These  reasons 
have  to  do  with  the  (intuitively)  expected  characteristics  of 
FSL's  of  clauses  generated  at  the  deeper  levels  of  search.  As 
with  most  questions  of  resolution  search  efficiency,  these 
characteristics  seem  quite  resistant  to  theoretical  analysis. 




Page  87 


We mention here that there are some immediate surface
analogies between SL-resolution (similar also to Model Elimination
(Loveland, 1968, 1972)) and HLR. These become apparent if
connections are made between FSL literals and framed literals, and
ordered clauses and lock numbered ordered literals in an HL-clause.
There are some similarities, but the differences are basic enough
to raise the question as to whether HLR could be advantageously
extended by incorporating some analogue of the A- and B-literal
concepts of SL-resolution.

4. The HLR refinement, while performing about as well as the
best general purpose complete resolution strategies available
on simple problems and expected to increase in relative
advantage on harder problems, is still quite inadequate when
compared with what it would seem possible to do on theorem
proving tasks.

5. The real worth of the HL-Resolution refinement lies in what
seem  to  be  realizable  extensions  of  the  method.  Some  of  these 
are  given  in  the  next  section. 




Page  88 




3.7  Extensions  of  the  HL-Resolution  Strategy 


This  section  enumerates  some  of  the  specific  directions  in 
which  it  is  thought  that  the  HL-Resolution  strategy  should  be 
extended.


1.  HL-Resolution  could  produce  in  its  search  a  clause  in  which 
two  standard  literals  have  the  same  lock  numbers.  It  would 
seem  that  completeness  can  be  maintained  if  separate  new  lock 
numbers  are  then  assigned  (either  true  or  false  lock  numbers, 
whichever  is  involved  in  the  collision).  It  remains  to  be  seen 
if  such  a  collision  signals  any  identifiable  situation  which 
would  add  new  restrictions  to  the  search  process.  It  would  be 
interesting  to  try  to  develop  LR  into  an  even  more  restrictive 
strategy,  and  then  combine  that  strategy  with  TMS  to  obtain  a 
stronger  form  of  HLR.  Peterson  (Peterson,  1976)  has  shown  how 
LR  can  be  used  to  extend  strategies  which  are  complete  for  Horn 
sets  to  strategies  which  are  complete  in  general.  The  main 
result,  called  LNL-T  resolution,  is  a  lock  resolution  strategy 
which is not directly compatible with HLR, but it would seem
that there would be similar strategies that would be
compatible.

2.  In  SL-resolution  framed  literals  act  as  a  derivation  tree 
history  marker  in  the  ordered  clause  in  which  they  occur.  This 
information  signals  the  specific  points  in  the  search  where  the 
search  deviates  from  input  resolution  constraints.  If  a 
similar  (i.e.  input)  type  of  refinement  could  be  grafted  into 
the  HLR  structure  the  expected  result  would  be  a  further 


Page  89 


increase  in  search  efficiency. 

3.  Subsumption  has  not  been  considered  yet  in  HLR.  The 
obvious  thing  to  try  to  do  is  to  develop  some  criteria  for 
semantic  subsumption  through  the  use  of  the  model.  This  seems 
difficult  to  do  if  it  is  required  that  completeness  be 
maintained.

4.  The  presentation  of  the  basic  HLR  strategy  was  independent 
of  what  Herbrand  interpretation  is  used  as  the  model,  M.  Thus 
HLR  is  sound  and  complete  independent  of  the  choice  of  M. 
However,  it  is  assumed  that  part  of  the  TSP  will  depend  upon 
the  choice  of  the  model.  There  are  numerous  questions 
concerning  the  relationship  in  HLR  of  a  specific  M  to  a 
specific  set  of  clauses.  If  we  assume  M  is  a  model  scheme,  and 
thus  has  parameters  which  must  be  specified  before  the  search 
begins,  then  we  want  to  know  what  parameter  choices  are  best. 
At  present  very  little  is  known  about  how  to  compare  the  worth 
of  two  model  schemes  before  the  search  begins. 

5.  The  ultimate  sensitivity  of  HLR  to  the  exact  choice  of 
model  is  not  yet  known.  If  the  nature  of  the  model  proves  to 
be  crucial  in  search  efficiency  for  difficult  problems,  as  it 
would  seem  reasonable  to  assume  will  be  the  case,  then  it 
becomes  important  to  develop  methods  that  will  facilitate  the 
process  of  bringing  the  proper  model  to  bear  upon  a  search 
effort.  Such  methods  might  include  a  library  of  established 
model  schemes  and  procedures  for  deciding  which  model  to  use  on 
a  given  problem.  These  procedures  might  even  be  re-invoked 
after  some  partial  search  has  been  done  and  a  new  model 


Page  90 


selection  made  on  the  basis  of  the  partial  search  results. 

6.  It  would  be  highly  effective  if  a  way  were  found  to  combine 
several  models  into  a  new  model  having  more  desirable  features 
than  any  of  its  constituent  pieces.  This  seems  rather 
difficult  to  do  at  the  present  time.  Henschen  discusses  how  to 
combine  several  Herbrand  interpretations  into  a  new 
interpretation  for  the  case  of  Horn  sets  and  has  developed  a 
semantic  refinement  strategy  for  Horn  sets  (Henschen,  1975). 
It  remains  to  be  seen  if  A-models  can  be  combined  in  an 
analogous  manner  and  the  results  extended  to  non-Horn  cases. 

7.  HLR  is  sound  and  complete  with  an  arbitrary  Herbrand 
interpretation  as  the  model,  M.  There  are  procedures  that  are 
simple  extensions  to  the  basic  HLR  strategy  which  may  or  may 
not  affect  soundness  and  completeness,  depending  upon  the 
particular  clause  sets  they  are  applied  to.  It  is  important  to 
characterize  in  which  situations  this  occurs.  One  of  these 
procedures  we  call  evaluating  out  a  predicate  letter  (a  simple 
extension  of  "elimination  by  evaluating  predicates",  (Nilsson, 
1971,  p.218)).  Evaluating  out  a  given  literal  in  a  clause 
involves  simply  removing  the  literal  from  the  standard  literals 
of  a  clause  and  adding  it  to  the  FSL  set,  and  then  keeping  the 
transformed  clause  iff  it  is  feasible  in  M  (the  original  clause 
is  deleted).  Evaluating  out  a  predicate  letter  involves 
evaluating  out  every  standard  literal  in  a  clause  set  using 
that  predicate  letter.  Evaluating  out  a  predicate  letter  is 
not  a  soundness  preserving  operation,  in  general.  What  happens 
is that the meaning of unsatisfiability is modified from


Page 91


"unsatisfiable means false in every interpretation" to
"unsatisfiable means false according to M". If M is a model
scheme then it contains many individual interpretations, and
unsatisfiable will mean false over all of the individual
interpretations  in  M.  In  particular  we  notice  that  the 
modified  notion  of  unsatisfiability  pertains  only  to  the 
predicate  letter(s)  actually  evaluated  out.  We  see  here  what 
is  the  strongest  single  extension  available  in  the 
HL-Resolution  structure,  namely  the  ability  to  set  a  context, 
through  the  use  of  the  model,  for  theorem  proving.  The  process 
of  evaluating  out  predicate  letters  brings  model  based 
information into the declarative clause set structure in a
complexity reducing manner (by removing standard literals and
turning them into FSL literals), rather than by complexity
increasing  mechanisms  (such  as  adding  new  declarative 
information  in  the  form  of  additional  clauses).  The  process  of 
evaluating  out  literals  needs  to  be  better  understood. 

8.  The  LIP  needs  much  further  work.  The  obvious  choice  for  a 
carrier  of  global  information  in  the  HLR  search,  the  model,  is 
not  yet  adequately  transmitting  information  of  sufficient 
strength.  There  are  several  possible  ways  to  attack  the  LIP. 
All  seem  to  require  large  amounts  of  additional  computer 
resource  on  a  per  clause  generated  basis,  and  thus  would  only 
be  applicable  for  problems  with  large  search  spaces.  As  an 
example  of  such  an  attack  on  one  aspect  of  the  LIP,  consider 
the following. Let S be an unsatisfiable set of general level
clauses,  and  C  a  clause  in  S.  We  say  that  the  clause  x 


Page  92 


properly  subsumes  the  clause  y  iff  x  subsumes  y  and  y  does  not 
subsume  x.  Then  in  searching  for  a  proof  one  need  never 
consider  C  to  represent  ground  instances  which  are  properly 
subsumed  by  any  other  clause  in  S.  In  HLR  all  of  the  literals 
of  C  survive  (as  either  standard  or  FSL  literals)  in  clauses 
derived  from  C.  As  clauses  derived  from  C  which  are  deeper  in 
the  search  are  produced,  it  may  be  possible  to  detect  at  some 
point  that  a  resolvent  is  in  fact  using  only  ground  instances 
of  C  which  can  be  subsumed  by  some  other  input  clause  (or 
clauses).  If  such  a  situation  occurs,  then  the  resolvent  may 
be  deleted.  Such  a  deletion  strategy  is  a  complete  refinement 
of  HLR.  There  are  also  some  variants  of  this  strategy  which 
are  not  known  to  be  complete  (nor  known  to  be  incomplete) 
refinements  of  HLR.  An  important  point  to  be  emphasized  here 
is  that  the  above  example  is  totally  syntactic  in  its 
orientation  (assuming  that  subsumption  is  defined 
syntactically).  One  is  led  to  wonder  if  there  are  analogous 
semantic  refinements. 
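
The definition of proper subsumption in item 8 can be made concrete
with a small sketch. The Python below is an illustration only and is
not part of the HLR strategy: it assumes function-free clauses written
as lists of tuples, treats only the names in the hypothetical set
VARIABLES as variables, and tests subsumption syntactically (x subsumes
y when some substitution maps every literal of x onto a literal of y).

```python
# Illustrative sketch only. Literals are tuples such as ("@ON", "x", "y");
# VARIABLES is a hypothetical naming convention, not taken from the report.
VARIABLES = {"x", "y", "z"}


def match(literal, target, substitution):
    """Extend substitution so that literal becomes target, or return None."""
    if literal[0] != target[0] or len(literal) != len(target):
        return None
    extended = dict(substitution)
    for term, goal in zip(literal[1:], target[1:]):
        if term in VARIABLES:
            # A variable may be bound only once, and consistently.
            if extended.setdefault(term, goal) != goal:
                return None
        elif term != goal:
            return None
    return extended


def subsumes(x, y, substitution=None):
    """True if some substitution maps every literal of x onto a literal of y."""
    substitution = {} if substitution is None else substitution
    if not x:
        return True
    for target in y:
        extended = match(x[0], target, substitution)
        if extended is not None and subsumes(x[1:], y, extended):
            return True
    return False


def properly_subsumes(x, y):
    """x properly subsumes y iff x subsumes y and y does not subsume x."""
    return subsumes(x, y) and not subsumes(y, x)


general = [("@ON", "x", "y")]
specific = [("@ON", "B1", "B2"), ("@COLOR", "B1", "blue")]
print(properly_subsumes(general, specific))  # the general clause is strictly more general
print(properly_subsumes(general, general))   # a clause never properly subsumes itself
```

The deletion strategy described in item 8 would additionally have to
track which ground instances of C a resolvent actually uses; that
bookkeeping is not attempted here.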




Page  93 




4.0 Summary

A new resolution strategy, called HL-Resolution, has been
presented  as  a  sound  and  complete  refinement  strategy  combining 
Lock  Resolution  and  The  Model  Strategy.  HL-Resolution  combines 
both  semantic  and  syntactic  refinement  criteria,  and  in  particular 
tries  to  achieve  an  increase  in  the  globality  (or  context)  of  the 
individual  resolution  steps  (through  the  use  of  FSL  sets).  A 
simple  example  was  given  using  HLR  indicating  that  it  was  an 
efficient  strategy.  The  basic  HLR  strategy  was  judged  comparable 
to  such  strategies  as  SL-resolution  for  simple  problems,  but  would 
be  expected  to  have  a  relative  advantage  on  more  complex  problems. 
The  primary  utility  of  HLR  is  that  it  offers  a  suggestive  framework 
for  the  development  of  further  strategies,  particularly  through  the 
use  of  models. 

The  models  useful  in  HL-Resolution  were  described  as  model 
schemes  which  initially  are  specified  at  a  certain  level  of 
generality,  and  become  more  specified  as  the  resolution  search 
proceeds.  This  is  done  in  a  way  which  allows  a  deferment  of  the 
decision  as  to  exactly  what  Herbrand  interpretation  the  model 
represents  until  more  information  is  available  from  the  search 
process.

Considerably  more  work  remains  to  be  done  both  in  formalizing 
the  results  already  obtained  concerning  models,  and  in  extending 
the  basic  HL-Resolution  strategy. 




Page  94 


REFERENCES 

1. Boyer, R., Locking: A Restriction of Resolution, Ph.D.
Thesis, University Microfilms International, Ann Arbor,
Michigan, 1971.

2. Chang, C. L. and Lee, R. T. C., Symbolic Logic and
Mechanical Theorem Proving, Academic Press, New York, 1973.

3. Cooper, D. C., "Theorem Proving in Arithmetic without
Multiplication", in Machine Intelligence 7, (eds. Meltzer, B.
and Michie, D.), Edinburgh University Press, Edinburgh, 1972,
pp. 91-99.

4. Henschen, L. J., "Semantic Resolution for Horn Sets",
Advance Papers of the Fourth International Joint Conference on
Artificial Intelligence, vol. 1, 1975, pp. 46-52.

5. Kowalski, R. and Kuehner, D., "Linear Resolution with
Selection Function", Artificial Intelligence, 2, 1971,
pp. 227-260.

6. Loveland, D. W., "Mechanical Theorem-Proving by Model
Elimination", J. ACM, 15, 2, 1968, pp. 236-251.

7. Loveland, D. W., "A Unifying View of Some Linear Herbrand
Procedures", J. ACM, 19, 2, 1972, pp. 366-384.

8. Luckham, D., "Refinement Theorems in Resolution Theory",
Proc. IRIA Symp. Automatic Demonstration, Versailles, France,
1968, Springer-Verlag, New York, 1970, pp. 163-190.




Page  95 


9. Moore, R. C., Reasoning From Incomplete Knowledge in a
Procedural Deduction System, AI-TR-347, Artificial Intelligence
Laboratory, Massachusetts Institute of Technology, 1975.

10. Nilsson, N. J., Problem Solving Methods in Artificial
Intelligence, McGraw-Hill, New York, 1971.

11. Peterson, G. E., "Theorem Proving with Lemmas", J. ACM,
23, 4, 1976, pp. 573-581.

12. Robinson, J. A., "A Machine-Oriented Logic Based on the
Resolution Principle", J. ACM, 12, 1, 1965, pp. 23-41.

13. Slagle, J. R., "Automatic Theorem Proving With Renamable
and Semantic Resolution", J. ACM, 14, 4, 1967, pp. 687-697.

14. Slagle, J. R., "Automatic Theorem Proving with Built-in
Theories Including Equality, Partial Ordering, and Sets", J.
ACM, 19, 1, 1972, pp. 120-135.

15. Wos, L., Robinson, G. A., Carson, D. F. and Shalla, L.,
"The Concept of Demodulation in Theorem Proving", J. ACM, 14,
4, 1967, pp. 698-709.




Page  96 


APPENDIX 


To  help  the  reader  check  on  his  understanding  of  the 
definitions  used  in  sections  3.4  and  3.5,  an  example  is  given  here. 
The  problem  statement  describes  the  following  world,  and  is  taken 
with  minor  changes  from  Moore  (Moore,  1975). 

Three  blocks,  B1,  B2,  and  B3  are  stacked  with  B1  on  the 
top  and  B3  on  the  bottom  and  B2  in  the  middle.  B1  is  blue  in 
color,  and  B3  is  green.  It  is  not  known  if  B2  is  blue  or 
green,  but  it  is  one  or  the  other. 

The problem task is to show that in this world there are two
blocks, one on the other, and the upper one is blue and the lower
one is green. The set of clauses this translates into will be
called "Colored Blocks". Colored Blocks is the initial clause set,
S.


S:  Colored  Blocks 


S1. ON(B1,B2);

S2. ON(B2,B3);

S3. COLOR(B1,blue);

S4. COLOR(B3,green);

S5. COLOR(B2,blue), COLOR(B2,green);

S6. @ON(x,y), @COLOR(x,blue), @COLOR(y,green);




Page  97 


The  predicate  and  constant  names  are  self  explanatory. 
Clause  S6  is  the  denial  of  the  existence  of  the  situation  that  we 
are  going  to  show  actually  does  hold,  while  the  first  5  clauses  are 
just  the  facts  we  do  know  to  be  true. 

We  pick  some  arbitrary  HL-proper  lock  numbering  for  the 
HL-associated  input  set  of  Colored  Blocks. 

HL-associated  Input  Set  of  Colored  Blocks 


1. <1,ON(B1,B2),100>;  FSL = []

2. <2,ON(B2,B3),200>;  FSL = []

3. <3,COLOR(B1,blue),300>;  FSL = []

4. <4,COLOR(B3,green),400>;  FSL = []

5. <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>;  FSL = []

6. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
   <9,@COLOR(y,green),900>;  FSL = []

Now,  in  order  to  form  the  exact  primitive  representation 
HL-associated  input  set  for  Colored  Blocks,  it  is  necessary  to  have 
a  particular  model  specified.  We  pick  as  the  domain  of  the  model 
the  set  of  real  numbers,  and  use  only  the  usual  equality  relation 
within  this  model.  We  pick  some  specific  individual  real  numbers 
to  represent  the  constants  in  LC.  Specifically  we  have  the 
following  correspondence  from  LC  to  LM. 




Page  98 


ON  =>  = 

COLOR  =>  = 

B1  =>  1 
B2  =>  2 
B3  =>  3 
blue  =>  1 
green  =>  3 

This  gives  us  the  following  exact  primitive  representation 
HL-associated  input  set  for  Colored  Blocks,  with  respect  to  M. 


Exact  Primitive  Representation 
HL-Associated  Input  Set  for  Colored  Blocks 

Here  we  underline  the  active  lock  number  for  each  literal,  and 
also  underline  the  second  component  of  any  selected  literals  in 
each  clause.  The  particular  lock  numbering  we  have  chosen  gives 
just  one  selected  literal  per  clause.  For  ease  in  reading,  the 
lock  numbers  are  omitted  for  the  FSL  literals. 


1. <1,ON(B1,B2),100>;  FSL = [ON(B1,B2)]  F

2. <2,ON(B2,B3),200>;  FSL = [ON(B2,B3)]  F

3. <3,COLOR(B1,blue),300>;  FSL = [@COLOR(B1,blue)]  T

4. <4,COLOR(B3,green),400>;  FSL = [@COLOR(B3,green)]  T

5. <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>;
   FSL = [COLOR(B2,blue),COLOR(B2,green)]  F




Page  99 


6.1. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [ON(x,y),COLOR(x,blue),COLOR(y,green)]  T

6.2. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [ON(x,y),@COLOR(x,blue),@COLOR(y,green)]  T

6.3. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [@ON(x,y),COLOR(x,blue),@COLOR(y,green)]  T

6.4. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [ON(x,y),COLOR(x,blue),@COLOR(y,green)]  T

6.5. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [@ON(x,y),@COLOR(x,blue),COLOR(y,green)]  T

6.6. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [ON(x,y),@COLOR(x,blue),COLOR(y,green)]  T

6.7. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
     <9,@COLOR(y,green),900>;
     FSL = [@ON(x,y),COLOR(x,blue),COLOR(y,green)]  T




Page  100 


We  disambiguate  the  numbering  to  give  us  the  unambiguously 
HL-proper  lock  numbered  exact  primitive  representation 
HL-associated  input  set  for  Colored  Blocks.  This  corresponds  to 
HLS of the completeness theorem in section 3.5.


HLS: Exact Primitive Representation
HL-Associated Input Set of Colored Blocks (disambiguated)


HLS1. <1,ON(B1,B2),100>;  FSL = [ON(B1,B2)]  F

HLS2. <2,ON(B2,B3),200>;  FSL = [ON(B2,B3)]  F

HLS3. <3,COLOR(B1,blue),300>;  FSL = [@COLOR(B1,blue)]  T

HLS4. <4,COLOR(B3,green),400>;  FSL = [@COLOR(B3,green)]  T

HLS5. <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>;
      FSL = [COLOR(B2,blue),COLOR(B2,green)]  F

HLS6.1. <7,@ON(x,y),700>, <8,@COLOR(x,blue),800>,
        <9,@COLOR(y,green),900>;
        FSL = [ON(x,y),COLOR(x,blue),COLOR(y,green)]  T

HLS6.2. <10,@ON(x,y),701>, <11,@COLOR(x,blue),702>,
        <12,@COLOR(y,green),703>;
        FSL = [ON(x,y),@COLOR(x,blue),@COLOR(y,green)]  T

HLS6.3. <13,@ON(x,y),704>, <14,@COLOR(x,blue),705>,
        <15,@COLOR(y,green),706>;
        FSL = [@ON(x,y),COLOR(x,blue),@COLOR(y,green)]  T

HLS6.4. <16,@ON(x,y),707>, <17,@COLOR(x,blue),708>,
        <18,@COLOR(y,green),709>;
        FSL = [ON(x,y),COLOR(x,blue),@COLOR(y,green)]  T




Page  101 


HLS6.5. <19,@ON(x,y),710>, <20,@COLOR(x,blue),711>,
        <21,@COLOR(y,green),712>;
        FSL = [@ON(x,y),@COLOR(x,blue),COLOR(y,green)]  T

HLS6.6. <22,@ON(x,y),713>, <23,@COLOR(x,blue),714>,
        <24,@COLOR(y,green),715>;
        FSL = [ON(x,y),@COLOR(x,blue),COLOR(y,green)]  T

HLS6.7. <25,@ON(x,y),716>, <26,@COLOR(x,blue),717>,
        <27,@COLOR(y,green),718>;
        FSL = [@ON(x,y),COLOR(x,blue),COLOR(y,green)]  T


SG  for  Colored  Blocks 

Using  Colored  Blocks  as  S,  we  have  an  SG  in  which  clauses  1 
through  5  are  just  as  they  are  in  Colored  Blocks,  and  clause  6  of 
Colored  Blocks  yields  two  ground  instances. 

SG1. ON(B1,B2);

SG2. ON(B2,B3);

SG3. COLOR(B1,blue);

SG4. COLOR(B3,green);

SG5. COLOR(B2,blue), COLOR(B2,green);

SG6. @ON(B1,B2),@COLOR(B1,blue),@COLOR(B2,green);

SG7. @ON(B2,B3),@COLOR(B2,blue),@COLOR(B3,green);




Page  102 


GR(S,SG)  for  Colored  Blocks 

A  GR(S,SG)  that  connects  this  S  and  SG  is: 

GR(S,SG) = [(null),(null),(null),(null),(null),
            ((B1/x,B2/y),(B2/x,B3/y))]

which  is  a  list  of  6  elements.  The  first  5  elements  are  identical, 
and  consist  each  of  a  single  substitution  which  we  call  null.  Null 
causes no substitutions to occur on the corresponding clause but
does  cause  one  copy  of  that  clause  to  be  in  SG.  The  sixth  element 
of  GR(S,SG)  consists  of  two  distinct  substitutions,  generating  the 
two  ground  instances  (clauses  SG6  and  SG7)  in  SG  that  correspond  to 
clause  S6  of  Colored  Blocks. 
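
As a hedged illustration of how the sixth element of GR(S,SG) produces
SG6 and SG7, the following Python sketch applies the two substitutions
to clause S6. The tuple and dictionary encoding, and the helper names
apply_substitution and show, are assumptions made for this example and
do not appear in the report.

```python
# Illustrative encoding; the tuple and dict layout is an assumption.
S6 = [("@ON", "x", "y"), ("@COLOR", "x", "blue"), ("@COLOR", "y", "green")]

# Sixth element of GR(S,SG): the substitutions B1/x,B2/y and B2/x,B3/y.
GR6 = [{"x": "B1", "y": "B2"}, {"x": "B2", "y": "B3"}]


def apply_substitution(substitution, clause):
    """Replace each term the substitution binds; leave other terms alone."""
    return [
        (literal[0],) + tuple(substitution.get(term, term) for term in literal[1:])
        for literal in clause
    ]


def show(clause):
    """Print a clause in the notation used for the SG listing."""
    return ",".join(f"{lit[0]}({','.join(lit[1:])})" for lit in clause) + ";"


ground_instances = [apply_substitution(sub, S6) for sub in GR6]
for number, clause in enumerate(ground_instances, start=6):
    print(f"SG{number}. {show(clause)}")
```

The null substitution of the first five elements corresponds to an
empty dictionary in this encoding, which leaves a clause unchanged.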


HLSG  for  Colored  Blocks 

The  set  HLSG  for  Colored  Blocks  has  its  first  5  clauses  the 
same  as  the  first  5  clauses  in  HLS,  which  is  the  exact  primitive 
representation  HL-associated  input  set  for  Colored  Blocks 
(disambiguated).  The  reason  is  that  the  first  5  elements  of 
GR(S,SG) each consists of just one substitution, namely the null
substitution.  Clauses  6  and  7  of  HLSG  result  from  applying  each  of 
the  elements  of  the  sixth  component  of  GR(S,SG)  to  each  clause 
among  HLS6.1  through  HLS6.7.  This  gives  one  feasible  ground  clause 
from  HLS6.6,  which  becomes  clause  6  of  HLSG,  and  one  feasible 
ground  clause  from  clause  HLS6.4,  which  becomes  clause  7  of  HLSG. 




Page  103 


1. <1,ON(B1,B2),100>;  FSL = [ON(B1,B2)]  F

2. <2,ON(B2,B3),200>;  FSL = [ON(B2,B3)]  F

3. <3,COLOR(B1,blue),300>;  FSL = [@COLOR(B1,blue)]  T

4. <4,COLOR(B3,green),400>;  FSL = [@COLOR(B3,green)]  T

5. <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>;
   FSL = [COLOR(B2,blue),COLOR(B2,green)]  F

6. <22,@ON(B1,B2),713>, <23,@COLOR(B1,blue),714>,
   <24,@COLOR(B2,green),715>;
   FSL = [ON(B1,B2),@COLOR(B1,blue),COLOR(B2,green)]

7. <16,@ON(B2,B3),707>, <17,@COLOR(B2,blue),708>,
   <18,@COLOR(B3,green),709>;
   FSL = [ON(B2,B3),COLOR(B2,blue),@COLOR(B3,green)]  T
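
The selection of HLS6.6 and HLS6.4 can be checked mechanically. The
sketch below is an illustration under two assumptions that are inferred
from this example rather than quoted from the definitions: ON and COLOR
are both evaluated as equality over the chosen real numbers, and a
ground clause counts as feasible when every literal in its FSL is false
in M. The names VALUE, FSL_PATTERNS, atom_values and feasible_variants
are hypothetical.

```python
# Sketch under stated assumptions: ON and COLOR are both read as "=" over
# the reals, and a ground clause is taken to be feasible when every
# literal in its FSL is false in M.
VALUE = {"B1": 1, "B2": 2, "B3": 3, "blue": 1, "green": 3}

# FSL sign patterns of HLS6.1 to HLS6.7 for the atoms
# ON(x,y), COLOR(x,blue), COLOR(y,green); True marks an "@" in the FSL.
FSL_PATTERNS = {
    "HLS6.1": (False, False, False),
    "HLS6.2": (False, True, True),
    "HLS6.3": (True, False, True),
    "HLS6.4": (False, False, True),
    "HLS6.5": (True, True, False),
    "HLS6.6": (False, True, False),
    "HLS6.7": (True, False, False),
}


def atom_values(x, y):
    """Truth values in M of ON(x,y), COLOR(x,blue), COLOR(y,green)."""
    return (
        VALUE[x] == VALUE[y],
        VALUE[x] == VALUE["blue"],
        VALUE[y] == VALUE["green"],
    )


def feasible_variants(x, y):
    """Names of the variants whose FSL literals are all false for x, y."""
    values = atom_values(x, y)
    # An unnegated FSL literal is false when its atom is false; a negated
    # one is false when its atom is true, so the pattern must equal values.
    return [name for name, pattern in FSL_PATTERNS.items() if pattern == values]


print(feasible_variants("B1", "B2"))  # substitution B1/x,B2/y
print(feasible_variants("B2", "B3"))  # substitution B2/x,B3/y
```

Under these assumptions each substitution leaves exactly one feasible
variant, in agreement with clauses 6 and 7 of HLSG above.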

SGL  for  Colored  Blocks 

Finally  this  yields  an  SGL  for  Colored  Blocks  as  follows: 

1. <ON(B1,B2),100>;

2. <ON(B2,B3),200>;

3. <COLOR(B1,blue),3>;

4. <COLOR(B3,green),4>;

5. <COLOR(B2,blue),500>, <COLOR(B2,green),600>;

6. <@ON(B1,B2),22>, <@COLOR(B1,blue),714>, <@COLOR(B2,green),24>;

7. <@ON(B2,B3),16>, <@COLOR(B2,blue),17>, <@COLOR(B3,green),709>;
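
The lock numbers in SGL appear to follow a simple rule, which the
sketch below checks against the listing. It is an inference from the
values printed in this appendix, not a definition quoted from the text:
for an HL-literal written as <first number, literal, last number>, the
first number is used when the literal is true in M and the last number
when it is false. The encoding and the names VALUE, HLSG and
active_lock are hypothetical.

```python
# Assumed reading: in <t, literal, f>, t is used when the literal is true
# in M and f when it is false. ON and COLOR are both evaluated as "=".
VALUE = {"B1": 1, "B2": 2, "B3": 3, "blue": 1, "green": 3}

# HLSG clauses as lists of (t, negated, predicate, term1, term2, f).
HLSG = {
    1: [(1, False, "ON", "B1", "B2", 100)],
    2: [(2, False, "ON", "B2", "B3", 200)],
    3: [(3, False, "COLOR", "B1", "blue", 300)],
    4: [(4, False, "COLOR", "B3", "green", 400)],
    5: [(5, False, "COLOR", "B2", "blue", 500),
        (6, False, "COLOR", "B2", "green", 600)],
    6: [(22, True, "ON", "B1", "B2", 713),
        (23, True, "COLOR", "B1", "blue", 714),
        (24, True, "COLOR", "B2", "green", 715)],
    7: [(16, True, "ON", "B2", "B3", 707),
        (17, True, "COLOR", "B2", "blue", 708),
        (18, True, "COLOR", "B3", "green", 709)],
}


def literal_true_in_m(negated, first, second):
    """A literal is true when its atom's value differs from its '@' flag."""
    return (VALUE[first] == VALUE[second]) != negated


def active_lock(hl_literal):
    """Pick the first or last number of an HL-literal according to M."""
    t, negated, _predicate, first, second, f = hl_literal
    return t if literal_true_in_m(negated, first, second) else f


SGL_LOCKS = {n: [active_lock(lit) for lit in clause] for n, clause in HLSG.items()}
for number, locks in SGL_LOCKS.items():
    print(number, locks)
```

The computed numbers agree with every lock number in the SGL listing
above, which is the only evidence offered here for the assumed rule.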




Page  104 


R-SGL  for  Colored  Blocks 


6x1 = 8.   <@COLOR(B1,blue),714>, <@COLOR(B2,green),24>;
7x2 = 9.   <@COLOR(B2,blue),17>, <@COLOR(B3,green),709>;
9x5 = 10.  <COLOR(B2,green),600>, <@COLOR(B3,green),709>;
10x8 = 11. <@COLOR(B3,green),709>, <@COLOR(B1,blue),714>;
11x4 = 12. <@COLOR(B1,blue),714>;
12x3 = 13. *BOX*;
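
The refutation R-SGL can be replayed step by step. The sketch below
assumes the usual lock resolution rule, in which each parent may be
resolved only on its lowest-numbered literal and a merged literal keeps
the lower number; the string encoding of literals and the helper names
negate and lock_resolve are choices made for this illustration.

```python
# Ground lock resolution replay. Clauses map a literal to its lock number.
SGL = {
    1: {"ON(B1,B2)": 100},
    2: {"ON(B2,B3)": 200},
    3: {"COLOR(B1,blue)": 3},
    4: {"COLOR(B3,green)": 4},
    5: {"COLOR(B2,blue)": 500, "COLOR(B2,green)": 600},
    6: {"@ON(B1,B2)": 22, "@COLOR(B1,blue)": 714, "@COLOR(B2,green)": 24},
    7: {"@ON(B2,B3)": 16, "@COLOR(B2,blue)": 17, "@COLOR(B3,green)": 709},
}


def negate(literal):
    """Toggle the '@' negation sign."""
    return literal[1:] if literal.startswith("@") else "@" + literal


def lock_resolve(left, right):
    """Resolve two ground clauses on their lowest-numbered literals."""
    low_left = min(left, key=left.get)
    low_right = min(right, key=right.get)
    if low_left != negate(low_right):
        raise ValueError("lowest-numbered literals are not complementary")
    resolvent = {}
    for clause, used in ((left, low_left), (right, low_right)):
        for literal, lock in clause.items():
            if literal != used:
                # If a literal occurs in both parents, keep the lower number.
                resolvent[literal] = min(lock, resolvent.get(literal, lock))
    return resolvent


clauses = dict(SGL)
# The steps 6x1=8, 7x2=9, 9x5=10, 10x8=11, 11x4=12, 12x3=13 from R-SGL.
for new, (a, b) in zip(range(8, 14), [(6, 1), (7, 2), (9, 5), (10, 8), (11, 4), (12, 3)]):
    clauses[new] = lock_resolve(clauses[a], clauses[b])
    print(new, clauses[new])
```

Every step resolves on the lowest-numbered literal of both parents, and
clause 13 comes out as the empty clause, matching *BOX* above.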


R-HLSG  for  Colored  Blocks 

Here the FSL literals will be written with all three
components to make it easier to see where they came from.


6x1 = 8.  <23,@COLOR(B1,blue),714>, <24,@COLOR(B2,green),715>;
          FSL = [<22,ON(B1,B2),713>, <23,@COLOR(B1,blue),714>,
                 <24,COLOR(B2,green),715>, <1,ON(B1,B2),100>]  T

7x2 = 9.  <17,@COLOR(B2,blue),708>, <18,@COLOR(B3,green),709>;
          FSL = [<16,ON(B2,B3),707>, <17,COLOR(B2,blue),708>,
                 <18,@COLOR(B3,green),709>, <2,ON(B2,B3),200>]  T

9x5 = 10. <6,COLOR(B2,green),600>, <18,@COLOR(B3,green),709>;
          FSL = [<16,ON(B2,B3),707>, <17,COLOR(B2,blue),708>,
                 <18,@COLOR(B3,green),709>, <2,ON(B2,B3),200>,
                 <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>]  F




Page 105


10x8 = 11. <18,@COLOR(B3,green),709>, <23,@COLOR(B1,blue),714>;
           FSL = [<16,ON(B2,B3),707>, <17,COLOR(B2,blue),708>,
                  <18,@COLOR(B3,green),709>, <2,ON(B2,B3),200>,
                  <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>,
                  <22,ON(B1,B2),713>, <23,@COLOR(B1,blue),714>,
                  <24,COLOR(B2,green),715>, <1,ON(B1,B2),100>]  F

11x4 = 12. <23,@COLOR(B1,blue),714>;
           FSL = [<16,ON(B2,B3),707>, <17,COLOR(B2,blue),708>,
                  <18,@COLOR(B3,green),709>, <2,ON(B2,B3),200>,
                  <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>,
                  <22,ON(B1,B2),713>, <23,@COLOR(B1,blue),714>,
                  <24,COLOR(B2,green),715>, <1,ON(B1,B2),100>,
                  <4,@COLOR(B3,green),400>]  F

12x3 = 13. *BOX*;
           FSL = [<16,ON(B2,B3),707>, <17,COLOR(B2,blue),708>,
                  <18,@COLOR(B3,green),709>, <2,ON(B2,B3),200>,
                  <5,COLOR(B2,blue),500>, <6,COLOR(B2,green),600>,
                  <22,ON(B1,B2),713>, <23,@COLOR(B1,blue),714>,
                  <24,COLOR(B2,green),715>, <1,ON(B1,B2),100>,
                  <4,@COLOR(B3,green),400>, <3,@COLOR(B1,blue),300>]


R-HLS  for  Colored  Blocks 

The  refutation  for  HLS  is  identical  (in  this  particular 
example)  to  R-HLSG  except  that  clause  6  of  HLSG  is  replaced  by 
clause  HLS6.6,  and  clause  7  of  HLSG  is  replaced  by  clause  HLS6.4. 




Page 106


INDEX

Items  in  this  index  are  only  those  which  are  defined  in  the 
text,  and  the  page  references  are  only  to  the  definitions.  For 
abbreviations  not  listed  here  see  the  table  following  this  index. 


=> 3

Against  (resolving  against)  30 
Algorithmic  based  models  (A-models)  13 

Ambiguous  lock  numbering  (see  disambiguated  and  unambiguous) 

Basic  HL-Resolution  strategy  53 

Basic  primitive  representation  deduction  61 

Basic  primitive  representation  refutation  61 

Disambiguated  62 

Elementary  factoring  step  35 
Evaluating  out  (a  predicate  letter)  90 

Evaluation  function  (for  a  Herbrand  interpretation)  56,  58 
Exact  primitive  representation  57 

Exact  primitive  representation  HL-associated  input  set  62 
Exact  primitive  representation  set  57 

f  (selector  function)  54 
F  29,  38 

Factoring  unifier  60 

False  lock  number  28,  32,  54 

False  parent  61 

False  substitution  literals  27,  55 
Feasible  clause  56 
FSL  27,  55 

Ground  case  model  12 

Ground  HL-clause  55 

Ground  literal  55 

Ground  literal  truth  value  55 

Ground  normal  lock  function  70 

Ground  normal  lock  image  70 

Grounding  substitution  (for  a  literal)  55 

Grounding  substitution  (for  a  set  of  literals)  26,  55 

Grounding  substitution  (for  an  HL-clause)  55 

Herbrand  interpretation  53 
HL-associated  input  set  62 
HL-clause  27,  55 
HL-literal  54 
HL-null clause 56
HL-proper  lock  numbering  54 

Immediate  primitive  representation  deduction  70 
Immediately  deducible  70 
Inexact  primitive  representation  57 
Infeasible  56 




Page  107 


Influential  variable  28 
Influential  literal  28 

Lemma  I  67 

Lemma  II  68 

Lemma  III  71 

Lemma  IV  72 

Lemma  V  83 

Lock  numbering  54 

Lock  Resolution  5 

Local  interaction  problem  49 

LSLE  38 

M  (as  a  function)  56,  58 
MLR  (Modified  Lock  Resolution)  76 
Minimally unsatisfiable 2
Model  evaluation  function  56,  58 
Modified  lock  resolution  76 

Naturally  associated  68,  73-76,  80 
Negation  (of  a  literal)  28 
Normal  resolution  3 

Primitive  representation  57 

Primitive  representation  binary  resolvent  60 

Reduced  selected  factor  60 
Resolution  step  2 

Satisfy  (a  set  of  literals)  26 
SD  27,  55 

Selected  factor  59,  60 
Selected  false  literal  59 
Selected  literal  59 
Selected  true  literal  58 

Simultaneous  linear  equation  (SLE)  model  22 
Singly  connected  3 
Standard  literals  27,  55 
Strict  grounding  68 

t  (selector  function)  54 

T  29,  38 

T/F  29,  38 

The  Model  Strategy  7 

Theorem  (soundness  and  completeness  of  HLR)  63 
Term  substitution  problem  47ff. 

Trivial  model  3,  12 

True  clause  27 

True  lock  number  28,  32,  54 

True  parent  61 

Unambiguous  (lock  numbering)  54 




TABLE  OF  ABBREVIATIONS 


CNF        Conjunctive Normal Form
.FA.       The quantifier "for all"
HI         Herbrand interpretation
HLR        Hereditary-Lock Resolution
LC (LM)    Language of the clauses (of the model)
LIP        Local Interaction Problem
LR         Lock Resolution
M          Model (usually meaning a collection of Herbrand
           interpretations), or a model evaluation function
SC         Sentential Calculus
SR         Semantic Resolution
.TE.       The quantifier "there exists"
TMS        The Model Strategy
TSP        Term Substitution Problem
.BOX.      The empty list (or set) of literals
fc         subtraction or set difference
¬          negation sign


SOSAP-TR-32 


June  1977 


FORMAL  SPECIFICATIONS  OF  MODELS  FOR  SEMANTIC  THEOREM  PROVING  STRATEGIES 
D.  M.  Sandford 


Department  of  Computer  Science 

Hill  Center  for  the  Mathematical  Sciences 

Busch  Campus 

Rutgers  University 

New  Brunswick,  New  Jersey 


This research was supported by the Advanced Research Projects Agency
of the Department of Defense under Grant #DAHC15-73-G6 to the
Rutgers  Project  on  Secure  Systems  and  Automatic  Programming 

The  views  and  conclusions  contained  in  this  document  are  those  of  the 
author  and  should  not  be  interpreted  as  necessarily  representing  the 
official  policies,  either  expressed  or  implied,  of  the  Advanced 
Research  Projects  Agency  or  the  U.  S.  Government. 


ACKNOWLEDGMENTS 

I would like to thank Professor A. Yasuhara for a careful
proofreading and suggestions on improving the clarity of
presentation.

This work was supported in part by the Advanced Research
Projects Agency of the Office of the Secretary of Defense under
grant DAHC15-73-G6.


CONTENTS 


Chapter 1

1.0 Introduction
1.1 Models in General
1.2 MDS
1.3 Model Schemes Instead of Models

Chapter 2

2.0 Introduction
2.1 General Definitions and Nomenclature
2.2 Basic Model Scheme Nomenclature
2.3 Basic Model Scheme Results
2.4 Basic Model Scheme Specification
2.5 A Sound and Complete Refinement Using Incomplete Model Schemes
2.6 An Example of a Basic Modeling Structure

Chapter 3

3.0 Discussion

REFERENCES

TABLE OF ABBREVIATIONS


1.0  Introduction 


This report is a companion report to SOSAP-TR-30 (Sandford,
1977, hereafter TR-30), and is a more formal treatment of material
introduced informally in Chapter 2 of TR-30. This report is
theoretically  oriented,  and  constitutes  the  initial  presentation  of 
material  which  will  be  explored  in  greater  depth  in  a  thesis  by  the 
author  (forthcoming  in  1978). 


1.1  Models  in  General 




There  is  considerable  interest  in,  and  belief  in  the  utility 
of,  using  models  to  incorporate  semantic  information  into  the 
processing  repertoire  of  sophisticated  problem  solving  systems. 
Models  seem  to  be  important  in  controlling  search  space  size  when 
the  search  space  is  syntactically  defined.  One  of  the  most 
important  situations  of  this  type  is  in  theorem  proving  in  a 
specific  domain  of  discourse  using  a  general  purpose  theorem 
prover.  In  such  a  situation  it  is  hoped  that  the  model  will  supply 
the  needed  search  guidance  to  make  the  general  purpose  inference 
rules  efficient  in  the  specific  domain  of  application.  An  example 
of  such  a  specific  domain  is  that  of  program  verification.  In  this 
specific  case  the  theorem  proving  problems  which  are  generated  as 
subproblems  are  in  general  well  beyond  the  ability  of  currently 
available  syntactically  oriented  theorem  provers.  One  approach  to 
solving  this  difficulty  is  to  employ  semantic  information  to  guide 
the theorem proving search by using models. In order to do this
one must be able both to specify and manipulate models, and be able
to relate the model effectively to the syntactic search. TR-30 was
principally concerned with presenting a new resolution refinement
strategy which uses model-based information in its search. This
report  is  concerned  primarily  with  a  paradigm  for  firstly, 
specifying  models,  and  secondly,  (at  an  abstract  level)  utilizing 
models  so  that  their  semantic  information  is  available  to  a  model 
based  guidance  mechanism.  This  material  is  presented  in  terms  of  a 
first  order  resolution  refutation  situation  (Robinson,  1965),  but 
the  framework  seems  to  be  general  enough  so  as  to  have  application 
in  search  space  situations  which  are  phrased  in  other  notations. 

The  principal  result  obtained  in  this  research  and  presented 
here  is  that  sound  and  complete  syntactic  resolution  theorem 
proving  search  procedures  can  be  guided  by  models  which  are 
themselves  incomplete,  just  so  long  as  the  models  are  internally 
consistent . 


A model is not a well defined notion unless we also consider
what it is a model of. The notion of model we wish to emphasize is
that  a  model  is  a  structure  that  abstracts  the  relevant  structure 
of  what  it  models,  and  organizes  that  abstracted  relevant  structure 
in  a  manner  that  allows  efficient  problem  solving  on  the 
abstraction  (i.e.  efficiency  in  the  model,  not  necessarily 
efficiency  in  the  original  structure). 

When  considering  problem  solving  activity  as  an  automated 
process  there  is  a  constant  source  of  difficulty  which  appears  due 
to  the  present  state  of  development  of  the  field  of  artificial 
intelligence. This difficulty is that, while we wish to have
information processing performed which is of a given level of
complexity  when  understood  semantically,  we  are  forced  to  find 
fully  declarative  syntactic  representations  of  these  tasks  in  order 
to  express  them  to  a  computer.  This  mapping  to  syntactic 
representations  is  currently  unavoidable,  but  what  one  hopes  is 
avoidable  is  the  concurrent  loss  of  semantic  information  which 
accompanies  this  translation  as  accomplished  by  current 
representation  techniques. 

Model  use  (for  the  purposes  of  this  report)  is  an  approach 
which  tries  to  obtain  heuristic  efficiency  in  a  search  process  by 
making  the  declarative  syntactic  search  process  be  guided  by 
semantic  information.  Thus  the  efficiency  of  a  search  process  is  a 
function  of  the  model,  but  the  legitimacy  of  the  resulting  search 
is  totally  a  function  of  the  syntax  of  the  search  space  language 
and the syntactic rules of manipulation, and is thus not affected
by  the  semantic  content  of  the  model. 


There  are  several  different  ways  in  which  models  may  be 
related  to  the  declarative  information  processing  task.  One  of 
these  is  in  top  down  or  hypothesis  oriented  use  of  the  model  to 
govern  how  the  search  space  is  explored.  In  this  approach  the  task 
to  be  accomplished  is  performed  first  in  the  model,  and  then  the 
model  solution  is  projected  onto  the  syntactic  search  space  to  see 
if  the  model  solution  is  feasible  syntactically.  Thus  the 
syntactic  search  space  is  expanded  only  under  the  guidance  of  a 
relatively  complete  solution  scheme  proposed  by  the  model.  A 
converse way to use models is in a bottom up, syntactic, data driven
manner, such as HL-Resolution (Sandford, 1977). In this approach
the  syntactic  search  space  is  the  initiator  of  processing  and 
individual  syntactic  search  possibilities  are  locally  evaluated  as 
to  their  worth  by  the  model.  While  ultimately  sophisticated 
systems will probably use both top down and bottom up approaches,
as  well  as  other  modes  of  model  utilization,  the  top  down  manner 
seems  to  be  the  most  important  for  complex  information  tasks  in 
which  there  is  a  strong  model  available. 


In  this  report  material  is  presented  which  comes  closest  to 
adequately  establishing  an  abstract  paradigm  of  model  use  in  bottom 
up  resolution  refutation  procedures.  The  top  down  use  of  models 


1.2 MDS

The  Meta  Description  System  (MDS)  (Srinivasan,  1976)  is  a 
system  which  was  developed  ab  initio  with  a  very  general  and  strong 
capacity  to  accept  and  use  information  in  a  manner  which  is 
semantically  oriented,  and  in  fact  constitutes  a  modeling  facility 
of  considerable  power.  In  MDS  there  is  also  the  general  capacity 
for  theorem  proving  (and  there  is  no  specific  commitment  to  the 
exact  choice  of  inference  system  employed) .  MDS  is  in  a  reverse 
position  to  most  other  general  purpose  artificial  intelligence 
systems  in  the  sense  that  its  modeling  capability  is  much  stronger 
with  respect  to  its  formal  theorem  proving  capability  than  is 
usual.  In  particular  resolution  theorem  proving  programs  are 
virtually  without  exception  extremely  weak  in  modeling  capability. 

This  report  presents  a  bottom  up  view  of  the  relationship 
between  a  formal  theorem  proving  domain  and  a  model  for  that 
domain,  which  is  applicable  to  both  MDS  and  to  semantically 
oriented  resolution  refutation  procedures.  In  particular  if  the 
theorem  proving  component  of  MDS  is  taken  to  be  a  resolution 
refutation  theorem  prover,  then  this  report  can  be  viewed  as 
specifying  the  formal  characteristics  the  modeling  components  of 
MDS  must  have  to  ensure  refutation  completeness  of  the  theorem 
proving  component. 


1.3 Model Schemes Instead of Models


One  of  the  difficulties  in  utilizing  models  for  efficient 
information  processing  is  the  difficulty  in  finding  a  good  model, 
and  then  specifying  the  model  in  a  manner  in  which  it  can  actually 
be  used  by  an  automated  reasoning  system.  This  report  finesses 
these,  and  other  problems,  by  taking  a  theoretical  view  of  models 
which,  while  not  dealing  with  the  problem  of  model  finding  or  with 
pragmatic  computation  of  model  use  directly,  does  frame  things  in  a 
manner  which  makes  these  problems  pragmatically  approachable. 

The  fundamental  notion  is  that  a  model  is  actually  a  model 
scheme.  A  model  scheme  is  a  class  of  interpretations  which  is 
represented  by  a  set  of  logical  statements.  By  doing  this  the 
following  happens: 

1. By using a class, we obtain partial freedom from
introducing  a  semantic  bias  in  the  model  when  the  available 
semantic  information  is  insufficient  to  choose  a  unique 
interpretation; 

2.  By  representing  the  class  by  a  set  of  logical  statements 
we  have  a  natural  way  of  expressing  models  which  are  to  be 
utilized  to  guide  a  theorem  proving  search  (since  much  of  the 
attendant  techniques,  and  not  unimportantly  viewpoints,  are 
already  available). 


Chapter 2 presents these notions in greater detail.


2.0 Introduction

This  chapter  develops  the  formal  paradigm  of  the  use  of  model 
schemes  in  resolution  refutation  procedures.  This  chapter  will 
also  appear  in  a  forthcoming  (1978)  thesis  by  the  author. 

The  basic  structure  as  presented  in  the  remainder  of  this 
chapter  is  centered  around  the  situation  where  L  is  a  first  order 
language  in  which  a  set,  S,  of  clauses  is  expressed,  and  it  is 
necessary  to  find  a  refutation  of  S.  The  language  L'  would  then  be 
a language for a model scheme, and a set of first order formulas,
MSF, would be a scheme defining set of statements written in the
language  L'.  The  syntactic  search  space  is  generated  in  the 
language  L  according  to  a  semantic  resolution  refinement  (e.g. 
HL-Resolution), and during the expansion of the search space the
model  scheme  is  called  upon  to  perform  truth  evaluations  of  various 
sets  of  literals.  The  actual  content  of  the  search  space  is  a 
function  of  what  the  truth  evaluations  actually  are,  which  in  turn 
are  a  function  of  the  actual  model  scheme  used.  The  method  used 
for  making  truth  evaluations,  and  the  properties  of  the  entire 
theorem  proving  system  are  presented  at  an  abstract  level  in  the 
following  sections. 


2.1 General Definitions and Nomenclature

The notion of a first order language will be taken as
primitive  here,  and  is  to  be  the  standard  notion  of  a  first  order 
language  found  in  logic  texts  (e.g.  (Kleene,  1967)).  This  report 
is  concerned  more  with  the  semantics  of  languages  than  with  the 
exact  syntactical  structure  of  languages.  In  later  sections  some 
attention  is  paid  to  syntax  in  the  context  of  the  relationship 
between  languages.  Most  of  the  time,  when  it  is  necessary  to 
consider  syntax,  we  will  assume  that  (sets  of)  well  formed  formulas 
(wff's)  have  been  put  into  conjunctive  normal  form,  as  is  usual  in 
resolution. The definitions of terms, atoms and literals are all
standard (Nilsson, 1971) (Chang and Lee, 1973). Unless otherwise
stated  all  languages  considered  are  first  order  languages. 

If L is a first order language, then we denote the Herbrand
universe of L, which is the set of all ground terms of L, by HU(L),
and we denote the atom set of L, also called the Herbrand base of L,
by HB(L). HB(L) is the (usually infinite) set of all possible ground
atoms.

A Herbrand interpretation (HI) for the language L is a set of
ground literals such that every element of HB(L) appears, either
negated or unnegated, in the interpretation, and no element of HB(L)
appears both negated and unnegated in the interpretation. The set
of all possible Herbrand interpretations of the language L is
written as H(L).


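The definition of a Herbrand interpretation can be checked mechanically on a finite fragment of the Herbrand base. The following is a minimal sketch under stated assumptions: the base is a finite collection of ground atoms (the real HB(L) is usually infinite), a literal is encoded as a pair (positive, atom), and the name `is_herbrand_interpretation` is hypothetical rather than taken from the report.

```python
# Assumed encoding (not from the report): a ground literal is a pair
# (positive, atom), e.g. (False, "P(a)") stands for the negation of P(a).

def is_herbrand_interpretation(h, base):
    """True iff every atom of `base` occurs in `h` exactly once,
    either negated or unnegated, and nothing else occurs in `h`."""
    atoms_in_h = {atom for _, atom in h}
    if atoms_in_h != set(base):
        return False  # some atom is missing, or a foreign atom appears
    # No atom may appear both negated and unnegated.
    return all(not ((True, a) in h and (False, a) in h) for a in base)


BASE = ["P(a)", "Q(a)"]
GOOD = {(True, "P(a)"), (False, "Q(a)")}
MISSING = {(True, "P(a)")}
CLASHING = {(True, "P(a)"), (False, "P(a)"), (True, "Q(a)")}
```

Here `GOOD` is accepted, while `MISSING` (no literal for Q(a)) and `CLASHING` (P(a) both negated and unnegated) are rejected.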


If k is a literal of L, then by ¬k is meant the same literal
but with its negation status inverted, i.e. k has a negation sign
on it iff ¬k does not have a negation sign on it.

A  literal  (wff,  clause,  etc.)  is  said  to  be  ground  if  it 
contains  no  variable  symbols  within  it.  If  x  is  a  literal  (wff, 
clause,  etc.)  and  TAU  is  a  substitution,  then  TAU  is  said  to  be  a 
grounding  substitution  for  x  iff  x(TAU)  is  ground. 


Evaluation  Function 

A  clause  is  often  considered  as  a  set  of  literals  for  the 
purpose  of  deciding  the  truth  value  of  the  clause  in  some  given 
interpretation.  Let  h  be  some  HI  for  the  language  L.  If  K  is  a 
set  of  ground  literals  of  L,  then  K  is  said  to  be  false  in  h  iff 
K ∩ h is empty. Otherwise K is true in h. We write h(K) to
designate  the  truth  value  of  the  set  of  ground  literals,  K,  i.e. 
we  assume  that  there  is  a  function,  h,  which  maps  sets  of  ground 
literals  to  their  truth  value  in  the  interpretation  h. 

We extend the function h to apply to general level literals
(i.e. literals in which free variables occur) as follows. Let K
be a set of literals in the language L, and h some HI for L, then

    h(K) = false  iff there exists a grounding substitution,
                  TAU, for K such that (K(TAU)) ∩ h is empty;
           true   otherwise.

We  call  h  the  evaluation  function  for  the  interpretation  h. 

This  definition  of  the  evaluation  function  is  consistent  with 
the  view  of  a  set  of  literals  being  a  formula  which  is  a 
disjunction  of  the  literals  in  the  set  and  with  the  variable 
symbols  being  universally  quantified,  with  the  scope  of 
quantification  being  the  entire  set  of  literals.  Therefore  we  see 
that the set of literals to be evaluated is being treated as if it
was  a  clause.  Thus  this  definition  of  the  evaluation  function  is 
the  usual  one  used  in  resolution  strategies.  The  reason  that 
evaluation  functions  are  not  specifically  defined  to  have  clauses 
as  the  domain  of  definition  is  that  there  are  resolution  strategies 
which  utilise  evaluations  of  sets  of  literals,  as  we  have  defined 
them,  but  for  which  the  sets  of  literals  do  not  correspond  to  any 
clauses  of  the  search  space. 
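
The evaluation function can be sketched in code for the case where only finitely many grounding substitutions need to be tried. This is an illustration under assumptions that are not in the report: terms are encoded as nested tuples, variable symbols as plain strings, and the grounding substitutions range over a finite list `universe` of ground terms (the true Herbrand universe is generally infinite). The names `term_vars`, `apply_sub` and `evaluate` are hypothetical.

```python
from itertools import product

# Assumed encoding: a variable is a str; any other term is a tuple
# (function_symbol, arg1, ..., argn), so the constant a is ("a",).
# A literal is (positive, predicate_symbol, args_tuple).

def term_vars(t):
    """Variable symbols occurring in a term."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[1:]))

def apply_sub(t, sub):
    """Apply a substitution (dict: variable -> term) to a term."""
    if isinstance(t, str):
        return sub[t]
    return (t[0],) + tuple(apply_sub(a, sub) for a in t[1:])

def evaluate(h, K, universe):
    """h(K): False iff some grounding instance of K is disjoint from h."""
    variables = sorted(set().union(
        *(term_vars(a) for _, _, args in K for a in args)))
    for values in product(universe, repeat=len(variables)):
        tau = dict(zip(variables, values))
        instance = {(pos, pred, tuple(apply_sub(a, tau) for a in args))
                    for pos, pred, args in K}
        if not (instance & h):
            return False
    return True


a, b = ("a",), ("b",)
h = {(True, "P", (a,)), (False, "P", (b,))}   # P(a) true, P(b) false
```

With this h, the set {P(x)} evaluates to false (the instance x = b is disjoint from h), whereas {P(x), ¬P(x)} evaluates to true, in line with reading a literal set as a universally quantified disjunction.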

The Model Strategy

The Model Strategy (TMS) (Luckham, 1968) is a sound and
complete refinement of unrestricted binary resolution. In TMS
clauses  are  considered  as  sets  of  literals  and  some  arbitrarily 
chosen  but  fixed  HI,  h,  is  used  for  truth  evaluations.  Two  clauses 
are  allowed  to  resolve  together  iff  they  could  be  resolved  together 
in unrestricted resolution and at least one of the clauses has the
truth evaluation "false" in h.




Hereditary-Lock  Resolution 

Hereditary-Lock Resolution (HL-Resolution, or HLR)
(Sandford, 1977) is a sound and complete refinement of unrestricted
resolution which combines both the Lock Resolution (Boyer, 1971)
and TMS refinements. HLR is a refinement whose semantic component,
like that of TMS, is confined to being a function solely of the
truth evaluations of sets of literals with respect to some
arbitrarily chosen but fixed HI. Thus TMS and HLR are quite similar
with respect to their interfacing with the process of truth
evaluations by some interpretation.


Semantic  Resolution 

Semantic  Resolution  (SR)  (Slagle,  1967)  is  a  sound  and 
complete  refinement  of  resolution  which  is  clash  oriented,  and 
includes  some  literal  ordering.  The  SR  refinement,  like  TMS  and 
HLR,  has  a  semantic  component  which  is  solely  a  function  of  the 
truth  evaluations  of  sets  of  literals. 


Semantic Strategies and Functions

We  call  TMS,  HLR,  and  SR,  literal  set  semantic  refinements.  A 
literal  set  semantic  refinement  is  a  refinement  in  which  the 
semantic  (or  model,  or  interpretation)  based  component  of  the 
refinement  is  only  a  function  of  the  truth  evaluations  of  sets  of 
literals  according  to  some  function  whose  values  are  "true"  or 
"false".  This  function  is  called  the  semantic  function  of  the 
strategy. We define a semantic function for the language L to be
any total function on sets of literals from L into {true, false}.

TMS, HLR and SR are all sound refinements, independently of what
semantic function is used. These three strategies are each complete
for the class of clause sets in the language L if the semantic
function is taken to be the evaluation function for some HI, h, such
that h ∈ H(L).

False  Permissive  Semantic  Functions 

Let s and s' be two semantic functions for the language L, and
let K range over sets of literals from L. Then s' is said to be
false permissive with respect to s iff

    .FA. K: (s(K) = false) -> (s'(K) = false)

and s' is said to be properly false permissive with respect to s
iff

    .TE. K: (s(K) = true and s'(K) = false)

and s' is false permissive with respect to s.

Clearly  the  relation  of  false  permissive  is  transitive,  as  is 
the  relation  of  properly  false  permissive. 
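
The two relations can be tested on a finite sample of literal sets. The sketch below is an illustration only: a finite sample cannot establish a property quantified over all literal sets of L, the encoding of ground literals as (positive, atom) pairs is an assumption, and the function names are hypothetical.

```python
def is_false_permissive(s_prime, s, literal_sets):
    """On the sample: s(K) = false implies s'(K) = false."""
    return all(not s_prime(K) for K in literal_sets if not s(K))

def is_properly_false_permissive(s_prime, s, literal_sets):
    """False permissive, and some sampled K has s(K) true but s'(K) false."""
    return (is_false_permissive(s_prime, s, literal_sets)
            and any(s(K) and not s_prime(K) for K in literal_sets))


# Example semantic functions over ground literal sets.
h = frozenset({(True, "P"), (False, "Q")})

def s(K):
    """Evaluation function of the interpretation h (ground sets only)."""
    return bool(K & h)

def always_false(K):
    """A semantic function that calls every literal set false."""
    return False

SAMPLE = [
    frozenset(),
    frozenset({(True, "P")}),
    frozenset({(True, "Q")}),
    frozenset({(False, "P"), (True, "Q")}),
]
```

On this sample `always_false` is properly false permissive with respect to `s`, while `s` is not false permissive with respect to `always_false`.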

Sound Semantic Functions

A semantic function, s, for the language L is said to be sound
iff there exists an h ∈ H(L) such that s is false permissive with
respect to h. It is the case that if s' is false permissive with
respect to s, and s is sound, then s' is also sound.




False  Permissive  Complete  Refinements 


A  literal  set  semantic  refinement  is  said  to  be  false 
permissive  complete  iff  it  is  refutation  complete  over  the  class  of 
clause sets of the language L for every sound semantic function for
L.


TMS,  HLR  and  SR  are  all  false  permissive  complete.  We  do  not 
prove  this  in  this  report.  The  proof  of  this  is  trivial  provided 
that  deletion  strategies  (Chang  and  Lee,  1973)  (Nilsson,  1971)  are 
not  used.  The  completeness  of  HLR  using  both  deletion  strategies 
and  false  permissive  semantic  functions  will  be  dealt  with  in 
future work. The completeness of TMS and SR with deletion
strategies  and  false  permissive  semantic  functions  is  an  issue  that 
has  not  been  explored. 


2.2 Basic Model Scheme Nomenclature

First Order Languages

A  first  order  language  will  be  specified  by  an  ordered  pair, 
the  first  element  of  which  is  the  set  of  predicate  symbols,  and  the 
second  element  of  which  is  the  set  of  function  symbols  of  the 
language.  Thus  the  language  L  has  a  description  which  we  write  as 

    DESC(L) = <PRED(L), FUN(L)>

where PRED and FUN are the functions which map languages to their
sets of predicate and function symbols, respectively. We only
consider languages, L, where PRED(L) is non-empty. Each predicate
and function symbol has associated with it a finite non-negative
integer, called its arity. We only consider languages, L, such
that FUN(L) contains at least one function symbol of arity zero,
and for which PRED(L) and FUN(L) are both finite sets. Function
symbols of arity zero are called constants. A function (predicate)
symbol of arity k is also called a k-place function (predicate).

We  assume  the  existence  of  an  infinite  set  of  variable 
symbols,  and  all  languages  share  this  same  set  of  variable  symbols 
implicitly.  Also  implicitly  contained  in  every  language 
description  are  the  usual  logical  symbols  for  negation, 
quantification,  "implies",  "and",  "or",  and  any  others  that  are 
desired  and  can  be  defined  in  terms  of  those  just  explicitly 
mentioned. The definitions of terms, atoms, literals and well
formed formulas are all standard. All unbound variable symbols are
to be understood as universally quantified, except where explicitly
noted differently. When writing formulas the letters

    x, y, z, u, v, w, q, r, s, t, x1, y1, ..., x2, y2, ...

will be variable symbols.

Most  of  the  time  we  will  consider  expressions  of  theorem 
proving  problems  to  have  been  already  converted  to  conjunctive 
normal  form  as  is  usual  in  resolution. 


Language  Amicability 


Let L and L' be two first order languages. Then L' is said to
be amicable to L iff, for all k, if there exists a k-place
predicate symbol in PRED(L) then there exists at least one k-place
predicate symbol in PRED(L'), and if there exists a k-place
function symbol in FUN(L) then there exists at least one k-place
function symbol in FUN(L').
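
Amicability depends only on which arities occur among the predicate and function symbols, so it reduces to two subset tests. The sketch below assumes a language description encoded as a pair of dictionaries mapping each symbol to its arity; this encoding and the names `arities` and `is_amicable` are illustrative choices, not the report's.

```python
# Assumed encoding of DESC(L): (PRED, FUN), each a dict symbol -> arity.

def arities(symbols):
    """Set of arities that occur in a symbol table."""
    return set(symbols.values())

def is_amicable(l_prime, l):
    """True iff l_prime is amicable to l: every arity used by a predicate
    (function) symbol of l is also used by some predicate (function)
    symbol of l_prime."""
    pred, fun = l
    pred_p, fun_p = l_prime
    return arities(pred) <= arities(pred_p) and arities(fun) <= arities(fun_p)


L = ({"P": 2}, {"f": 2, "h": 2, "a": 0})
L_PRIME = ({"Q": 2}, {"g": 2, "b": 0})
L_SMALL = ({"R": 1}, {"c": 0})
```

Here `L_PRIME` is amicable to `L`, but `L_SMALL` is not, since it has no 2-place predicate symbol and no 2-place function symbol.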


Translation  Functions 

A symbol translation function from L to L', usually denoted by
"SYMT", is any function which, for all k, is a total function on the
set of predicate (function) symbols of L of arity k into the set of
predicate (function) symbols of L' of arity k. There can be a
symbol translation function from L to L' iff L' is amicable to L.
Notice that SYMT need not be a 1-1 mapping and that it need not be
onto. We define SYMT(ALPHA) = ALPHA when ALPHA is any variable
symbol, or the negation sign.


2.2 


In  the  obvious  way,  SYMT  induces  a  term  translation  function 
(TTF), and a literal translation function (LTF). A TTF applied to
a  list  of  terms  has  as  its  value  the  list  of  terms  obtained  by 
applying  TTF  to  each  individual  term  in  the  list.  Similarly  for 
LTF  applied  to  a  set  of  literals. 

As  an  example,  if  the  SYMT  from  L  to  L'  is  given  by 

    SYMT(f) = g
    SYMT(h) = g
    SYMT(P) = Q
    SYMT(a) = b

then the set of literals

    {P(a, f(x,y)), ¬P(h(y,a), a)}

translates under the induced LTF to

    {Q(b, g(x,y)), ¬Q(g(y,b), b)}

We further extend LTF to apply to substitutions of L, such that if
RHO is a substitution of L, then

    LTF(RHO) = {TTF(t)/v: t/v is an element of RHO},

where v is a variable symbol and t is a term from HU(L). Thus LTF
maps a substitution of L to a substitution of L'.


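The induced translation functions can be sketched directly from the SYMT of the example above. The encoding is an assumption made for illustration: variables are plain strings, any other term is a tuple headed by its function symbol, and a literal is (positive, predicate, args). The second argument of the negated literal is taken to be the constant a, as implied by its translated image. The function names `ttf`, `ltf` and `ltf_substitution` are hypothetical.

```python
# SYMT from the example in the text.
SYMT = {"f": "g", "h": "g", "P": "Q", "a": "b"}

def ttf(term):
    """Term translation: rename function symbols, leave variables alone."""
    if isinstance(term, str):          # variable symbol: SYMT is the identity
        return term
    return (SYMT[term[0]],) + tuple(ttf(arg) for arg in term[1:])

def ltf(literals):
    """Literal translation applied to a set of literals."""
    return {(pos, SYMT[pred], tuple(ttf(arg) for arg in args))
            for pos, pred, args in literals}

def ltf_substitution(sub):
    """Translate a substitution given as a dict: variable -> term."""
    return {var: ttf(term) for var, term in sub.items()}


a, b = ("a",), ("b",)
K = {
    (True, "P", (a, ("f", "x", "y"))),        # P(a, f(x,y))
    (False, "P", (("h", "y", a), a)),         # not P(h(y,a), a)
}
EXPECTED = {
    (True, "Q", (b, ("g", "x", "y"))),        # Q(b, g(x,y))
    (False, "Q", (("g", "y", b), b)),         # not Q(g(y,b), b)
}
```

Because f and h both map to g, this SYMT is not 1-1; the translation itself is still well defined, which is all that the definitions of TTF and LTF require.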


The Language L\L'

Assume L' is amicable to L and some SYMT has been chosen. Let
the subset of PRED(L') which is actually in the range of SYMT from
L to L' be denoted by L\PRED(L'). Similarly for L\FUN(L').
Then there is a first order language whose description is

    <L\PRED(L'), L\FUN(L')>.

We denote this language by L\L', where the dependence on a specific
SYMT is not explicitly noted, but is assumed to be known from the
particular context in which L\L' is being used.

The  Mapping  of  Interpretations 

Let L' be amicable to L, and let SYMT be some specific symbol
translation function. This induces term and literal translation
functions TTF and LTF, and a language L\L'.

LTF was defined so as to map sets of literals from L to sets
of literals from L', but clearly every set of literals in the range
of LTF is also a set of literals in the language L\L'. A similar
remark holds for LTF applied to substitutions.

Let K be a set of literals from L. Then LTF(K) is called the
homomorphic image of K, or just the image of K.

Let h be a HI of L. If the image of h is a HI of L\L', then
this image is called a condensate of h. In those situations where
SYMT is not a 1-1 mapping there will exist a HI for L whose image
is not a HI of L\L', and therefore whose image is not a condensate.


Reduct  and  Expansion  Relationships 

Let L' be amicable to L, and let SYMT be a symbol translation
function from L to L'. Then, in general, for each HI, h, of L\L',
there will be more than one HI, h', of L', such that h ⊆ h'
(remember that the predicate and function symbols of L\L' are a
subset of those in L').

Let h be a HI of L\L'. The set

    {h': h ⊆ h' and h' ∈ H(L')}

is called the expansion cluster of h. For each h' in the expansion
cluster of h, h' is called an expansion of h, and h is called a
L\L' reduct of h', or just a reduct of h'.

Model  Schemes 

A model scheme for a language L is any non-empty subset of
H(L). Notationally, a model scheme for a language L will usually
be written as MSL.

Evaluations  Over  Model  Schemes 

Let K be a set of literals in the language L, and let MSL be a
model scheme for L. Then we define a model scheme evaluation
function, also denoted MSL, for the model scheme MSL:

    MSL(K) = "F"  iff .TE. h: h ∈ MSL and h(K) = false
             "T"  otherwise

The model scheme evaluation of K over MSL is the value of MSL(K).
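
For ground literal sets the model scheme evaluation is a short existential test over the interpretations in the scheme. The sketch below is restricted to that ground case and to a finite scheme, and uses an assumed encoding of literals as (positive, atom) pairs; `h_eval` and `scheme_eval` are hypothetical names.

```python
def h_eval(h, K):
    """Ground case of h(K): true iff K shares a literal with h."""
    return bool(set(K) & set(h))

def scheme_eval(msl, K):
    """MSL(K): "F" iff some h in the scheme makes K false, else "T"."""
    return "F" if any(not h_eval(h, K) for h in msl) else "T"


# A scheme that fixes P as true and leaves Q undetermined.
MSL = [
    frozenset({(True, "P"), (True, "Q")}),
    frozenset({(True, "P"), (False, "Q")}),
]
```

With this scheme {P} evaluates to "T", while both {Q} and {¬Q} evaluate to "F". A scheme that does not determine Q therefore reports "F" for a literal and for its negation alike, which is the sense in which a class of interpretations avoids committing to a semantic bias.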


Associated  Semantic  Function 

Let M be a model scheme for the language L, and let K range
over sets of literals of L. Then the function s defined by

    s(K) = false  iff K is "F" over M
           true   iff K is "T" over M

is called the associated semantic function for M.

Let M and M' be two model schemes for the same language. We
say that M is (properly) false permissive with respect to M' iff
the associated semantic function for M is (properly) false
permissive with respect to the associated semantic function for M'.


Induced  Model  Schemes 

Let L' be amicable to L, and SYMT be a symbol translation
function from L to L'. Then for every model scheme, MSL', of L',
there is an induced (or corresponding) model scheme, MSL\L', of
L\L', defined by

    MSL\L' = {m: m is the L\L' reduct of some m' in MSL'}

Thus every model scheme in L' induces a unique model scheme in
L\L'.

Let MSL\L' be any model scheme of the language L\L'. Then
there is an induced model scheme, MSL, of the language L, defined
by

    MSL = {m: LTF(m) ∈ MSL\L'}

Equivalently we can say that MSL is the subset of H(L) whose
elements have condensates which are elements of MSL\L'.

Thus every model scheme of L\L' induces a unique model scheme
of L (with respect to a given SYMT). As a result of this, every
model scheme of L' induces a unique model scheme of L. Notice that
it is possible, in general, for two distinct model schemes in L' to
induce the same model scheme in L\L', and therefore the same model
scheme in L. We refer to the collection of L, L', L\L', SYMT, etc.
as a model structure or model system.
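
The two induction steps (from L' to L\L' by taking reducts, and from L\L' to L by taking preimages under LTF) can be traced on a tiny finite model structure. Everything specific in the sketch is an assumption made for illustration: L has predicates P, R and constant a; L' has predicates Q, S and constant b; SYMT sends P and R to Q and a to b; ground atoms are written as strings and LTF on ground atoms is given by the table `ATOM_MAP`. The function names are hypothetical.

```python
from itertools import product

ATOM_MAP = {"P(a)": "Q(b)", "R(a)": "Q(b)"}   # LTF on the ground atoms of L

def all_his(base):
    """All Herbrand interpretations over a finite base of ground atoms."""
    base = sorted(base)
    return [frozenset(zip(signs, base))
            for signs in product([True, False], repeat=len(base))]

def reduct(h_prime, sub_base):
    """Restrict an interpretation of L' to the atoms of L\\L'."""
    return frozenset(lit for lit in h_prime if lit[1] in sub_base)

def induce_reducts(msl_prime, sub_base):
    """MSL\\L': the reducts of the members of MSL'."""
    return {reduct(h, sub_base) for h in msl_prime}

def image(h):
    """Homomorphic image of a ground literal set under LTF."""
    return frozenset((pos, ATOM_MAP[atom]) for pos, atom in h)

def induce_in_l(msl_sub, base):
    """MSL: the interpretations of L whose image lies in MSL\\L'."""
    return {h for h in all_his(base) if image(h) in msl_sub}


MSL_PRIME = {
    frozenset({(True, "Q(b)"), (True, "S(b)")}),
    frozenset({(True, "Q(b)"), (False, "S(b)")}),
}
MSL_SUB = induce_reducts(MSL_PRIME, {"Q(b)"})
MSL = induce_in_l(MSL_SUB, ["P(a)", "R(a)"])
```

Both members of `MSL_PRIME` have the same reduct, so `MSL_SUB` has a single element. The interpretation {P(a), ¬R(a)} of L is excluded from `MSL` because its image contains both Q(b) and ¬Q(b) and so is not a condensate.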

Model Scheme False Permissiveness

Let L' be a language amicable to the language L, and let SYMT
be a symbol translation function from L to L' which induces the
literal translation function LTF. Let M be a model scheme for L
and M' a model scheme for L'. Then M' is false permissive with
respect to M iff for all sets of literals, K, of L,

    M(K) = "F" -> M'(LTF(K)) = "F",

and M' is properly false permissive with respect to M iff it is
false permissive with respect to M and there exists a set of
literals, K, of L, such that

    M(K) = "T" and M'(LTF(K)) = "F".

If it happens that M is the induced model scheme corresponding
to M', then we say that the model structure is properly or
improperly false permissive according to whether M' is properly or
improperly false permissive with respect to M.


2.3 Basic Model Scheme Results

Lemma I

Let L' be amicable to L, and SYMT a symbol translation
function from L to L', and let MSL' be any model scheme for L'.
Further let MSL be the model scheme in L induced by MSL', and let C
be a set of literals in the language L. Then for all grounding
substitutions, SIGMA, such that C(SIGMA) is false in some h ∈ MSL,
there exists at least one h' ∈ MSL' such that LTF(C(SIGMA)) is
false in h'.

proof: By the definition of how MSL is induced by a MSL\L', which
is in turn induced by the MSL' of the lemma hypothesis, it is the
case that h ∈ MSL must be such that there exists an x ∈ MSL\L' such
that LTF(h) = x. By the definition of a literal translation
function from L to L', it is the case that if C(SIGMA) ∩ h is
empty, so is

LTF(C(SIGMA)) ∩ LTF(h) = LTF(C(SIGMA)) ∩ x.

But x must be the reduct of at least one interpretation in MSL'.
Let h' be any interpretation which is both in MSL' and in the
expansion cluster of x. This h' is an expansion of x. Therefore
LTF(C(SIGMA)) will also be false in h'.




Lemma II

Let  L,  L',  SYMT,  MSL,  MSL'  and  C  be  the  same  as  in  the 
hypothesis  of  Lemma  I.  Then  if  the  MSL  model  scheme  evaluation  of 
C  is  "F",  then  the  MSL'  model  scheme  evaluation  of  LTF(C)  is  "F". 

proof: If C is "F" over MSL, then there must exist a grounding
substitution, SIGMA, of C, and a HI, h, in MSL such that C(SIGMA)
is false in h. By Lemma I there is then some h' ∈ MSL' such that
LTF(C(SIGMA)) is false in h'. But by the definition of a literal
translation function, it is the case that

LTF(C(SIGMA)) = (LTF(C))(LTF(SIGMA)).

Thus there is a substitution, LTF(SIGMA), in the language L', and
an interpretation, h' ∈ MSL', such that (LTF(C))(LTF(SIGMA)) is
false in h'. Therefore LTF(C) is "F" over MSL'.

Lemma  III 

Let L, L', SYMT, MSL' and MSL be the same as in the hypothesis
of Lemma I. Let C be a ground set of literals in L whose model
scheme evaluation is "T" over MSL. Then the model scheme
evaluation of LTF(C) over MSL' is also "T".

proof: Let h' be an arbitrary HI in MSL'. Then there exists a h''
in MSL\L' which is the reduct of h', and an h in MSL such that h''
is the condensate of h. By hypothesis C ∩ h is non-empty.
Therefore LTF(C) ∩ h'' is non-empty. Therefore LTF(C) ∩ h' is
non-empty. Thus LTF(C) is true in every h' in MSL' and
consequently LTF(C) is "T" over MSL'.


At this point in the development one would like to have a
result that shows that for any non-ground level set of literals, C,
such that C evaluates to "T" over MSL, LTF(C) evaluates to "T"
over MSL'. However, such is not the case for the situations we are
talking about in this report. In section 2.6 an example is given
that illustrates this point.




2.4 Basic Model Scheme Specification


We  now  show  how  to  specify  model  schemes  and  obtain  model 
scheme  evaluations  in  the  language  L'. 


Satisfying  Interpretation  Set 


Let L' be amicable to L, and let SYMT be a symbol translation
function from L to L'. Let MSF be a set of first order well formed
formulas in the language L'. Let MSF be satisfiable. Then there
exists a non-empty subset of H(L') which consists of exactly those
interpretations which satisfy MSF. We call this set the satisfying
interpretation set of MSF, and denote it by SIS(MSF). By the
definition of a model scheme, SIS(MSF) is a model scheme of L'.

We  wish  to  be  able  to  perform  model  scheme  evaluations  of  sets 
of  literals  over  the  model  scheme  SIS(MSF). 


Theorem  I 


If K is a set of literals, K = {k1, k2, . . . kn}, in the
language L', and MSF is a set of satisfiable formulas in L', then
the model scheme evaluation of K is "F" over SIS(MSF) iff the set
of statements

{MSF} ∪ {@K}

is a satisfiable set of statements. (N.B. The meaning of @
applied to a set of literals is that all variables are changed from
universal to existential quantification and the negation status of
each literal in the set is inverted, and the resulting literals are
to be treated as a conjunction.)

proof : 

(=>) Assume K is "F" over SIS(MSF). Then there exists an
interpretation, h', in SIS(MSF), and a substitution, RHO, such that

{(k1(RHO)), (k2(RHO)), . . . (kn(RHO))}

are all false ground literals in h'. But then the set of literals

{@(k1(RHO)), @(k2(RHO)), . . . @(kn(RHO))}

is a set of ground literals all of which are true in h'. Thus the
set of literals @(K(RHO)) is true in h', and therefore the set of
literals @K is true in h'. Since MSF is true in h' by the
definition of h' as an element of SIS(MSF), {MSF} ∪ {@K} is true in
h', and is therefore satisfiable.

(<=) Assume {MSF} ∪ {@K} is satisfiable. Then there exists a HI,
h, which is an element of SIS(MSF), and for which there exists a
substitution, RHO, such that

@(k1(RHO)), @(k2(RHO)), . . . @(kn(RHO))

is a set of literals all of which are true in h. Thus the set of
literals

K(RHO) = {k1(RHO), k2(RHO), . . . kn(RHO)}

is a set such that every literal in it is false in h. Thus there
exists a substitution, RHO, and a HI, h, such that K(RHO) is false
in h. Thus the model evaluation of K over the model scheme
SIS(MSF), which contains h, is "F".


At this point we have connected the notion of model scheme
evaluation to the notion of satisfiability of the negation of the
set of literals being evaluated and the set of formulas specifying
the model scheme, all in the language L'. Thus the model scheme
evaluations can be computed by any decision procedure for
satisfiability testing. For ease of reference we refer to the model
scheme evaluation in L' of a set of literals over the SIS(MSF)
scheme as either the MSF evaluation or the L' evaluation.


Depending  upon  the  particular  language,  L',  and  the  particular 
set of scheme defining statements MSF, there may or may not exist a
decision  procedure  for  satisfiability  testing  for  the  class  of  sets 
of  literals  we  are  interested  in  evaluating.  The  class  of  sets  of 
literals  that  we  will  want  to  evaluate  is  itself  dependent  upon  the 
language  L,  the  chosen  SYMT,  the  set  of  input  statements  in  L,  and 
the  particular  strategy  being  employed  for  theorem  proving  in  L. 
Thus  it  would  seem  in  general,  to  be  difficult,  if  not  actually 
impossible  to  ensure  that  some  arbitrarily  established  model  scheme 
would  be  tractable.  While  this  is  true,  it  causes  considerably 
less  difficulty  than  one  would  at  first  think.  Why  this  is  the 
case  is  the  main  result  of  this  report,  and  we  now  continue 
directly  to  the  development  of  that  result. 




2.5 A Sound and Complete Refinement Using Incomplete Model Schemes


Theorem  II 


Let L be a first order language and MSL be any model scheme
for L. Let R be a false permissive complete resolution strategy,
with semantic function h defined by:

for all sets K of literals in L,

h(K) = false iff K is "F" over the model scheme MSL,
h(K) = true iff K is "T" over the model scheme MSL

(i.e. h is the associated semantic function for MSL). Then R will
be a complete refinement strategy when using this h as its semantic
function.


proof: Since R is false permissive complete, one need merely show
that h, as defined above, is a sound semantic function. This can be
done by showing that h is false permissive with respect to some
h' ∈ H(L), as follows. Let h' be any fixed element of MSL. Then
for any set of literals, K, if h'(K) is false, then MSL(K) = "F",
and by the definition of h given in the theorem, h(K) is false.
Thus h is false permissive with respect to the HI h'. Thus h is a
sound semantic function.


We are now ready for the main theorem of this report.




Theorem  III 

Let R be a false permissive complete resolution refinement for
the language L. Let L' be any language amicable to L, and LTF a
literal translation function from L to L'. Let R' be a sound
refutation procedure for the language L', which is not necessarily
complete, and which always terminates in a finite amount of time
when evaluating a set of statements (reporting only refutation
success or failure). Let MSF be any set of satisfiable statements
in the language L'. Let s be a semantic function defined for sets
of literals of L by:

s(K) = false iff R'(@(LTF(K)) ∪ MSF) = failure,
s(K) = true iff R'(@(LTF(K)) ∪ MSF) = success.

Then R will be a complete strategy when using s as its semantic
function.


proof: It is only necessary to show that s is a sound semantic
function. First we notice that MSF induces a model scheme,
MSL' = SIS(MSF), of L', and a corresponding model scheme, MSL, of
L. Consider the semantic function h, as defined in the previous
theorem. It was shown to be sound. Consider the semantic function
s*, defined on sets of literals, K, of L, by:

s*(K) = false iff LTF(K) is "F" over the model scheme MSL',
s*(K) = true iff LTF(K) is "T" over the model scheme MSL'.

By Lemma II we can assert that s* is false permissive with respect
to h. But by Theorem I and the soundness of R', it is the case
that s is false permissive with respect to s*. Therefore s is
false permissive with respect to h. But h is a sound semantic
function. Thus s is a sound semantic function.

Theorem III is a crude but interesting result. What we have
achieved is the establishment of a structure which allows a
complete theorem proving search process to be guided by an
interpretation environment which is itself incomplete. The
completeness of the theorem proving process rests only on the
soundness of the interpretation process. If the semantic strategy
used is TMS, HLR, or SR, then the theorem proving process is also
sound.
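
The construction of s in Theorem III can be illustrated at the
propositional (ground) level. The Python sketch below is a
simplification under stated assumptions, not the report's procedure:
literals are pairs (positive, atom), LTF is taken to be the identity,
the set K is ground so its negation is a list of unit clauses, and
unit propagation stands in for R' as one example of a refuter that is
sound, always terminates, and is not complete. All names are
hypothetical.

```python
def neg(lit):
    """Flip the sign of a literal represented as (positive, atom)."""
    positive, atom = lit
    return (not positive, atom)


def unit_refute(clauses):
    """Sound, incomplete, always-terminating refuter (unit propagation only).

    Returns True (success) only if the empty clause is derived.
    """
    forced = set()  # literals that must be true in any model
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in forced for lit in clause):
                continue  # clause already satisfied
            remaining = {lit for lit in clause if neg(lit) not in forced}
            if not remaining:
                return True  # every literal is refuted: empty clause
            if len(remaining) == 1:
                forced.add(next(iter(remaining)))
                changed = True
    return False


def make_s(refute, msf):
    """Build s: s(K) is True iff the refuter succeeds on negated K plus MSF."""
    def s(literals):
        negated = [[neg(lit)] for lit in literals]  # ground negation as unit clauses
        return refute(negated + msf)
    return s


p, q, r = "p", "q", "r"

# MSF for which unit propagation is enough: p, and p implies q.
msf_easy = [[(True, p)], [(False, p), (True, q)]]
s_easy = make_s(unit_refute, msf_easy)
print(s_easy([(True, q)]))   # True: the refutation is found

# MSF that entails r, but only by case analysis on p and q.
msf_hard = [
    [(True, p), (True, q)],
    [(True, p), (False, q)],
    [(False, p), (True, q)],
    [(False, p), (False, q), (True, r)],
]
s_hard = make_s(unit_refute, msf_hard)
print(s_hard([(True, r)]))   # False: r holds in every model, but the refuter gives up
print(s_hard([(False, p)]))  # False: the literal fails in every model
```

The second case shows the price of incompleteness: s answers false
for a set that is "T" over SIS(MSF). This only increases false
permissiveness; s never answers true for a set that is "F".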


2.6 An Example of a Basic Modeling Structure

Our example uses a language of the clauses, L, and a language
of the model, L', which are both such that they each have only a
finite number of distinct HI's.

We consider a language L which has one two place predicate
symbol and two constants:

DESC(L) = <{=(-,-)},{c1,c2}>.

Thus the Herbrand universe of L (HU(L)) is just the set consisting
of the two constants. The Herbrand base of L (HB(L)) is:

HB(L) = {=(c1,c1), =(c1,c2), =(c2,c1), =(c2,c2)}

Thus there are 16 possible Herbrand interpretations for L.

We will set up a model in the language L' such that the
induced model scheme in L, MSL, will be the singleton set
consisting of the following interpretation:

h = {=(c1,c1), =(c2,c2), @=(c1,c2), @=(c2,c1)}

Thus we will have MSL = {h}.

We choose L' to be specified by:

DESC(L') = <{R(-,-)},{k1,k2,k3}>.

The set HU(L') has the three constants as its only members and the
set HB(L') has 9 ground atoms in it. Therefore there are 512
possible HI's for L'.
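
The two counts follow from the size of each Herbrand base: with no
function symbols, a language with n constants and one two place
predicate has n*n ground atoms, and each subset of the base is one
interpretation. A short Python check is given below; the helper name
and the dictionary encoding of DESC are assumptions of this sketch.

```python
from itertools import product


def herbrand_base(predicates, constants):
    """Ground atoms for a language with no function symbols.

    predicates maps each predicate symbol to its number of places.
    """
    return [
        (name, args)
        for name, places in predicates.items()
        for args in product(constants, repeat=places)
    ]


hb_L = herbrand_base({"=": 2}, ["c1", "c2"])
hb_L_prime = herbrand_base({"R": 2}, ["k1", "k2", "k3"])

print(len(hb_L), 2 ** len(hb_L))              # 4 16
print(len(hb_L_prime), 2 ** len(hb_L_prime))  # 9 512
```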


We choose the following set of clauses as the defining set,
MSF, of clauses for the model scheme, MSL':

1. @R(k1,k2);

2. @R(k1,k3);

3. @R(k2,k3);

4. R(x,x);

5. @R(x,y), R(y,x);

6. @R(x,y), @R(y,z), R(x,z);

7. R(x,k1), R(x,k2), R(x,k3);

What we have done here by using clauses 4, 5 and 6, is to allow
only interpretations for L' which treat R as an equivalence
relation over the elements of HU(L'). Then clauses 1, 2 and 3
allow only interpretations for which k1, k2, and k3 are all
inequivalent. Finally clause 7 restricts the allowed
interpretations to those in which every HU element is equivalent to
one of k1, k2 or k3. There is exactly one HI for L' which
satisfies MSF. We call this interpretation h':

h' = {R(k1,k1), R(k2,k2), R(k3,k3), @R(k1,k2),
      @R(k1,k3), @R(k2,k1), @R(k2,k3), @R(k3,k1), @R(k3,k2)}

Thus we have the following:

MSL' = SIS(MSF) = {h'}
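
Because L' has only 512 Herbrand interpretations, the claim that
exactly one of them satisfies MSF can be confirmed by exhaustive
search. The Python sketch below is one way to do so; representing an
interpretation by the set of pairs for which R holds is an encoding
assumed here, not the report's.

```python
from itertools import product

K = ["k1", "k2", "k3"]
ATOMS = list(product(K, repeat=2))  # the 9 ground atoms R(a, b)


def satisfies_msf(rel):
    """rel is the set of pairs (a, b) for which R(a, b) is true."""
    # Clauses 1-3: k1, k2, k3 are pairwise unrelated.
    if ("k1", "k2") in rel or ("k1", "k3") in rel or ("k2", "k3") in rel:
        return False
    # Clause 4: reflexivity.
    if any((x, x) not in rel for x in K):
        return False
    # Clause 5: symmetry.
    if any((x, y) in rel and (y, x) not in rel for x, y in ATOMS):
        return False
    # Clause 6: transitivity.
    if any(
        (x, y) in rel and (y, z) in rel and (x, z) not in rel
        for x, y, z in product(K, repeat=3)
    ):
        return False
    # Clause 7: every element is related to k1, k2 or k3.
    if any(all((x, k) not in rel for k in K) for x in K):
        return False
    return True


models = []
for bits in product([False, True], repeat=len(ATOMS)):
    rel = {atom for atom, bit in zip(ATOMS, bits) if bit}
    if satisfies_msf(rel):
        models.append(rel)

print(len(models))        # 1
print(sorted(models[0]))  # [('k1', 'k1'), ('k2', 'k2'), ('k3', 'k3')]
```

The single surviving interpretation makes R true only on the diagonal,
which is h'.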

The function SYMT is chosen to be:

SYMT = {<c1,k1>,<c2,k2>,<=,R>}

The HU, HB, etc., for the language L\L' are all identical to the
ones for L except that k1 replaces c1, k2 replaces c2, and R
replaces =. Thus, for example, MSL\L' would be the singleton set
{h*}, where

h* = {R(k1,k1), R(k2,k2), @R(k1,k2), @R(k2,k1)}

The reader should be able to verify that MSF is satisfied only by
h' in H(L'), and that h* is the reduct of h', and that h* is the
condensate of h. Thus h and h' are corresponding interpretations,
and given that MSL' = {h'} it is the case that MSL = {h}.

We  now  exhibit  some  clause  evaluations.  Where  it  is  otherwise 
unclear,  sets  of  literals  that  form  a  clause  are  underlined.  First 
consider  the  clause  C,  in  the  language  L,  where 

C = =(x,c1), =(x,c2);

It is the case that h(C) = true, and thus MSL(C) = "T". Also it is
the case that h*(LTF(C)) = h*({R(x,k1),R(x,k2)}) = true. Thus
MSL\L'(LTF(C)) = "T".

On the other hand

h'(LTF(C)) = h'({R(x,k1),R(x,k2)}) = false

since the substitution TAU = {k3/x} is such that

h'((LTF(C))(TAU)) = h'({R(k3,k1),R(k3,k2)}) = false.

Thus MSL'(LTF(C)) = "F". We see that the model scheme, MSL', even
though it consists of just one HI, is still properly false
permissive with respect to MSL.
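
The three evaluations of clause C can be reproduced by enumerating
grounding substitutions. In the Python sketch below a clause is a
list of (positive, arguments) literals over the single two place
predicate, and an interpretation is the set of argument pairs for
which the predicate holds. This encoding and the helper names are
assumptions of the sketch.

```python
from itertools import product


def is_true(clause, variables, universe, true_atoms):
    """True iff every ground instance has at least one literal that holds."""
    for values in product(universe, repeat=len(variables)):
        sub = dict(zip(variables, values))
        holds = False
        for positive, args in clause:
            ground = tuple(sub.get(term, term) for term in args)
            if (ground in true_atoms) == positive:
                holds = True
                break
        if not holds:
            return False  # this substitution falsifies the clause
    return True


def diagonal(universe):
    """Interpretation in which the predicate holds exactly on pairs (u, u)."""
    return {(u, u) for u in universe}


C = [(True, ("x", "c1")), (True, ("x", "c2"))]
LTF_C = [(True, ("x", "k1")), (True, ("x", "k2"))]

h = diagonal(["c1", "c2"])
h_star = diagonal(["k1", "k2"])
h_prime = diagonal(["k1", "k2", "k3"])

print(is_true(C, ["x"], ["c1", "c2"], h))                  # True
print(is_true(LTF_C, ["x"], ["k1", "k2"], h_star))         # True
print(is_true(LTF_C, ["x"], ["k1", "k2", "k3"], h_prime))  # False
```

The last call fails on the substitution of k3 for x, the element of
HU(L') that has no counterpart in L.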


An example of a clause that is "T" over both MSL and MSL' is
the clause D:

D = =(x,c1), =(x,c2), =(y,c1), =(y,c2), =(x,y);

It should be clear that MSL(D) = "T" since every grounding
substitution for D in the language L must substitute either c1 or
c2 for x, and thus either the first or second literal of D will be
true in h. To show that MSL'(LTF(D)) = "T", i.e. that
h'(LTF(D)) = true, we show that @(LTF(D)) ∪ MSF is unsatisfiable,
as follows.

We have as the clause set 7 clauses from MSF, listed above,
and 5 more clauses, numbered 8 through 12, which are from @(LTF(D)):

8. @R(a,k1);

9. @R(a,k2);

10. @R(b,k1);

11. @R(b,k2);

12. @R(a,b);

The constants a and b are zero place Skolem functions introduced by
the process of negating the set of translated literals. The reader
should keep in mind that this set of 12 clauses is not in the
language L', since L' does not contain either a or b. However this
set of 12 clauses is unsatisfiable iff @(LTF(D)) ∪ MSF is
unsatisfiable.


A refutation of this set of 12 clauses is:

7 x 8    =  13.  R(a,k2), R(a,k3);
9 x 13   =  14.  R(a,k3);
7 x 10   =  15.  R(b,k2), R(b,k3);
11 x 15  =  16.  R(b,k3);
5 x 16   =  17.  R(k3,b);
6 x 14   =  18.  @R(k3,z), R(a,z);
17 x 18  =  19.  R(a,b);
12 x 19  =  20.  *BOX*;

Thus @(LTF(D)) ∪ MSF is unsatisfiable, and it is the case that
MSL'(LTF(D)) = "T".
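
The refutation uses only four ground instances of MSF clauses (two of
clause 7, one of clause 5, one of clause 6) together with the five
unit clauses obtained from the negated translation of D. A set of
clauses with an unsatisfiable set of ground instances is itself
unsatisfiable, so the result can be cross-checked by a truth-table
search over the eight ground atoms involved. The Python sketch below
does this; the string encoding of atoms is an assumption of the
sketch.

```python
from itertools import product

# Each clause is a list of (positive, atom) literals.
ground = [
    [(True, "R(a,k1)"), (True, "R(a,k2)"), (True, "R(a,k3)")],   # 7 with x = a
    [(True, "R(b,k1)"), (True, "R(b,k2)"), (True, "R(b,k3)")],   # 7 with x = b
    [(False, "R(b,k3)"), (True, "R(k3,b)")],                     # 5 with x = b, y = k3
    [(False, "R(a,k3)"), (False, "R(k3,b)"), (True, "R(a,b)")],  # 6 with x = a, y = k3, z = b
    [(False, "R(a,k1)")],  # 8
    [(False, "R(a,k2)")],  # 9
    [(False, "R(b,k1)")],  # 10
    [(False, "R(b,k2)")],  # 11
    [(False, "R(a,b)")],   # 12
]


def satisfiable(clauses):
    """Exhaustive truth-table test over the atoms occurring in the clauses."""
    atoms = sorted({atom for clause in clauses for _, atom in clause})
    for bits in product([False, True], repeat=len(atoms)):
        value = dict(zip(atoms, bits))
        if all(any(value[atom] == positive for positive, atom in clause)
               for clause in clauses):
            return True
    return False


print(satisfiable(ground))       # False
print(satisfiable(ground[:-1]))  # True: without clause 12 a model exists
```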

An example of a clause which is "F" over both MSL and MSL' is
clause E:

E = @=(x,y), =(x,c1), =(y,c1);

which is false in h (the falsifying substitution is {c2/x,c2/y}),
and MSL'(LTF(E)) = MSL'({@R(x,y),R(x,k1),R(y,k1)}) = "F" since the
set of clauses

{{R(c,d)}, {@R(c,k1)}, {@R(d,k1)}} ∪ MSF

is satisfiable (where c and d are again new Skolem constants
introduced by negating LTF(E)).
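
Satisfiability of this clause set can be shown by exhibiting one
model. The Python sketch below checks a candidate chosen for this
illustration (it is not given in the report): R is the equivalence
relation over {k1, k2, k3, c, d} whose classes are {k1}, {k2, c, d}
and {k3}, so that both Skolem constants are treated as equivalent to
k2.

```python
from itertools import product

# Equivalence classes of the assumed witness interpretation.
classes = [{"k1"}, {"k2", "c", "d"}, {"k3"}]
universe = sorted(set().union(*classes))


def R(x, y):
    """R(x, y) holds iff x and y lie in the same class."""
    return any(x in cls and y in cls for cls in classes)


checks = [
    not R("k1", "k2"), not R("k1", "k3"), not R("k2", "k3"),      # clauses 1-3
    all(R(x, x) for x in universe),                               # clause 4
    all(not R(x, y) or R(y, x)
        for x, y in product(universe, repeat=2)),                 # clause 5
    all(not (R(x, y) and R(y, z)) or R(x, z)
        for x, y, z in product(universe, repeat=3)),              # clause 6
    all(R(x, "k1") or R(x, "k2") or R(x, "k3") for x in universe),  # clause 7
    R("c", "d"), not R("c", "k1"), not R("d", "k1"),              # negated LTF(E)
]

print(all(checks))  # True
```

Every clause holds in the candidate, so the set is satisfiable and the
evaluation of LTF(E) over MSL' is "F", as stated.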




3.0 Discussion

Chapter 2 has presented an abstract structured situation in
which a sound and complete theorem proving search can be guided in
a bottom up fashion by a sound but incomplete model.

The  incompleteness  of  the  model  has  two  distinct  sources.  One 
is  simply  that  we  wish  to  be  unbiased  in  those  situations  where  the 
available  information  is  insufficient  to  specify  a  single 
interpretation.  This  is  a  pragmatic  choice  which  leads  naturally 
to  the  notion  of  a  model  scheme  as  a  set  of  interpretations  each  of 
which  satisfies  a  common  set  of  conditions  (the  available  model 
specification as a set of scheme defining statements, MSF, in the
model  language  L').  The  second  source  of  model  incompleteness  is 
related  to  the  question  of  efficiency  and  practicality  of 
performing  model  evaluations.  We  are  willing  to  permit  an  increase 
in  the  degree  of  false  permissiveness  in  the  model  evaluations  to 
obtain  a  reduction  in  the  average  amount  of  effort  expended  in 
performing model evaluations. In those cases where there exists no
decision  procedure  for  the  model  evaluations  this  increased  false 
permissiveness  is  mandatory.  In  other  cases  it  is  merely  a 
tradeoff  designed  to  reduce  total  theorem  proving  effort.  This 
tradeoff  arises  because,  for  the  false  permissive  complete 
strategies,  as  the  model  evaluation  effort  is  reduced  by  allowing 
an  increase  in  the  degree  of  false  permissiveness  of  the  semantic 
function,  the  syntactic  search  space  contains  an  increasing  number 
of  clauses  at  each  level.  A  theoretical  analysis  of  this  tradeoff 
seems well beyond current capabilities, and an understanding of the
dynamics of this situation will have to be sought in empirical
investigations. For a discussion of model incompleteness and the
difficulties  it  causes  in  problem  solving  searches  expressed  in  the 
PLANNER  formalism  see  Moore  (Moore,  1975).  These  difficulties  do 
not  arise  in  the  context  of  the  theorem  proving  formalism  using 
false  permissive  complete  semantic  refinements. 

In the example given in section 2.6 we saw a modeling
structure which was demonstrably false permissive for clause C in
the sense that MSL(C) = "T" while MSL'(LTF(C)) = "F". This false
permissiveness is not a result of indefiniteness as to which
interpretation is to be used, since the model schemes MSL and MSL'
each consisted of just one interpretation. Neither was it a result
of incompleteness of the satisfiability testing inside the model
scheme MSL'. Its origin is the fact that L' is a language which
has a more extensive Herbrand universe than does L. As a result,
when using the particular SYMT of the example, a statement with
universally quantified variables may be true in L, but its
translated form in L' is not necessarily true in the larger
Herbrand universe of L'. This is a specific case of a more general
phenomenon of model structures as presented in this report. This
phenomenon is that the model scheme in L' can represent and
manipulate individuals, functions, and predicates which are
inexpressible in the language L. It is this increased capacity of
the model to represent information that gives rise to the expansion
clusters. We can think of the model as dealing only with an
abstraction of the original problem, but dealing with that
abstraction in a context which brings in constructs which are
unobservable in the original problem structure. The purpose of
including these unobservable constructs is to give the model
sufficient structure so that it may be efficiently manipulated in
performing evaluations. We avoid speculation here on the ultimate
role and utility of unobservables in the type of modeling
situations we are concerned with in this report.


We now discuss some issues concerning the practical
application of the structures presented in this report. It should
already be clear that the efficiency of the model evaluation
process is of importance. Model schemes have been abstractly
described as sets of formulas, MSF, in L'. In the typical situation
the structure of these formulas will be such that it will be
prohibitive to perform model evaluations in L' as a theorem proving
type computation using MSF. Thus one criterion in establishing a
model scheme is to make sure that there will exist a practical
evaluation mechanism. This is a largely uninvestigated area at
present, but some of the possibilities are clear. One possibility
is that MSF constitutes a decidable theory in L', and there is an
efficient decision procedure known. An example of this is the
theory of simultaneous linear equations over real variables. This
model is illustrated in an example problem in TR-30. This model
has also been programmed and empirical results have been obtained
using it with HL-Resolution. Cooper has presented a specific
decision procedure (Cooper, 1972) for this model.




Another possibility is that no efficient decision procedure is
known for the model scheme SIS(MSF). This may be because MSF is an
undecidable theory, or is decidable but is inherently complex, or
we may just be ignorant of any existing decision procedure. In
this case we could try pure theorem proving procedures on MSF with
some cutoff on the effort expended. This would introduce some
additional incompleteness (and thus increase the degree of false
permissiveness in the associated semantic function of SIS(MSF)) but
it would be an acceptable way to proceed in some circumstances.
Alternatively one might try to model the set of statements MSF
itself by another model structure. Notice that such a second level
of modeling is not just an iteration or recursion type of
relationship of the first level of modeling. The first level model
must be sound, but can be incomplete. The second level model need
neither be sound nor complete, just so long as its answers are
checked by the first level model. At present no work has been done
in exploring the use of multiple levels of models.

The overall generality of the paradigm is high; the model
language, L', can be any first order language, and the model scheme
can be defined by any set of satisfiable formulas of L'. The MDS
modeling capability is, however, more than general enough to be
well matched to this paradigm for models.




REFERENCES 


1. Boyer, R., Locking: A Restriction of Resolution, Ph.D.
Thesis, University Microfilms International, Ann Arbor,
Michigan, 1971.

2. Chang, C. L. and Lee, R. T. C., Symbolic Logic and
Mechanical Theorem Proving, Academic Press, New York, 1973.

3. Cooper, D. C., "Theorem Proving in Arithmetic without
Multiplication", in Machine Intelligence 7, (eds. Meltzer, B.
and Michie, D.), Edinburgh University Press, Edinburgh, 1972,
pp. 91-99.

4. Kleene, Stephen Cole, Mathematical Logic, John Wiley &
Sons, Inc., New York, 1967.

5. Luckham, D., "Refinement Theorems in Resolution Theory",
Proc. IRIA Symp. Automatic Demonstration, Versailles, France,
1968, Springer-Verlag, New York, 1970, pp. 163-190.

Moore, R. C., Reasoning From Incomplete Knowledge in ...

... Resolution Principle", J. ACM, 12, 1, 1965, pp. 23-41.

9. Sandford, D. M., Hereditary-Lock Resolution: A Resolution
Refinement Combining a Strong Model Strategy with Lock
Resolution, ARPA SOSAP-TR-30, Rutgers University, 1977.

10. Slagle, J. R., "Automatic Theorem Proving With Renamable
and Semantic Resolution", J. ACM, 14, 4, 1967, pp. 687-697.

11. Srinivasan, C. V., Theorem Proving in the Meta Description
System, ARPA SOSAP-TR-20, Rutgers University, 1975.


TABLE OF ABBREVIATIONS


.FA.                  The quantifier "for all"
H(L)                  Set of all HI's for the language L
HB(L)                 Herbrand base for the language L
HI                    Herbrand interpretation
HLR                   Hereditary-Lock Resolution
HU(L)                 Herbrand universe for the language L
LTF                   Literal translation function
MSL (MSL', MSL\L')    Model scheme in the language L
                      (in the languages L', L\L')
SR                    Semantic Resolution
SYMT                  Symbol translation function
.TE.                  The quantifier "there exists"
TMS                   The Model Strategy
TTF                   Term translation function
*BOX*                 The empty list (or set) of literals
@                     negation sign


